Abstract of JP1 025491 2

PROBLEM TO BE SOLVED: To make data extracted from a
hypermedia document read naturally by building the
extracted data into a data tree, flattening that tree
into a linear document, and then formatting the
document.

SOLUTION: A web printer 17 accesses various web sites.
Based on the information stored in a personal news
profile 19 and a site profile 20, it commands a web
reader 34 to connect to the web via a web server 35,
so that data are retrieved from those web sites. The
reader 34 sends the retrieved data to the printer 17,
which uses the received data to assemble an extracted
data tree. The printer 17 then flattens the assembled
data tree to convert it into a linear document and
renders this document in an output format via an
output interface.
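
The flattening step described in the abstract can be illustrated with a short sketch. This is not the patent's implementation: the `Node` structure, the `flatten` and `render_plain` names, and the choice of a depth-first, pre-order walk are all assumptions made for illustration. It only shows one plausible way an extracted data tree might be turned into a linear sequence and then rendered in a simple output format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One node of a hypothetical extracted data tree (illustrative only)."""
    title: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def flatten(node: Node, depth: int = 0) -> List[Tuple[int, str, str]]:
    """Walk the tree depth-first (pre-order) and return a linear list of
    (depth, title, text) entries in reading order."""
    items = [(depth, node.title, node.text)]
    for child in node.children:
        items.extend(flatten(child, depth + 1))
    return items


def render_plain(items: List[Tuple[int, str, str]]) -> str:
    """Render the linear document as indented plain text
    (an assumed stand-in for the output format)."""
    lines = []
    for depth, title, text in items:
        indent = "  " * depth
        lines.append(indent + title)
        if text:
            lines.append(indent + text)
    return "\n".join(lines)


if __name__ == "__main__":
    tree = Node("News", children=[
        Node("Site A", children=[Node("Story 1", "Body 1")]),
        Node("Site B", "Body B"),
    ])
    print(render_plain(flatten(tree)))
```

Keeping the flattening separate from the rendering mirrors the two stages named in the abstract: the same linear list could be passed to a different renderer for another output format.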
flbftJCIcMK



(54) [$ZW(0%m V—frYVf Y • • Za-X^yXfA 



(57) 

-So 

(1) 

(3) h 
-^CUft^U (4 
>f T ■ y v^SriBtt 

tzir^zflM b T K 

iSL, 7;*-—^ s/ b 
mAitb**=L*> b 



^ - x ^mWM^j Id T ^ i? ;* 1~ 

St, (2) !)x^F7K^ 
{CT ^ir^-T^, J; 9 y 

) /i^-^ffM-rsw^-- ■ ^ 

^!»o-KU (5) ^^-y--r 
^ > M:Mt, (6) femtsft 

£T\ (1) - (5) <7>&J!4:«!3 







[«F«FW*<7?|8ia] 

l»*s i ] < * 1 1 w ^f^r- 

/ >4<i:tlo^^-^7 f >fr' K^r^y^hlw 

roatH^"^ . ^ y —&mffi K^^^y htcspta^t-r 10 

*<DBB b**^*^ b&7*—^<y blkZtitz K*^ 

/y> h«iM^ blc^ii-T^r ^^^m^i-^ii 
#jgi \zftM<o*f—9 7*—' ^y h^ffe, 

1 (JLEi^-r — * 7 a— y H^fe, 20 

icffittw-r*- ^ 7;*—- e y bJj& 0 

[»*3S5] itutaw^- * ^V7 • K^^yh 

— ^ t z T 9 ir ;* i" 5 X ^ t , 

Ktc JzoTmjE^-f^— • -r^r 7 • y co^ 
h!7-^_LcDiM h^lrt^Il 

^£D=3^> kj&^^*< t fc lo^^iij^aai-r^xg 40 

4 Mzi£t£>Z>^ /Hit Sr^rffli" £ r £ £r4#W 

[»*3S8] striae *< tt> loo^'Jii, ittjiew 

• tf^T • y y^^^yea-^-^^/ h!7 — ^ 

• ^x^r * y >^co^>t c ^— ^— h?— ^ so 



[W*3S 1 0 ] ^7-y--r b^x-^^v'^A 

7-/U K • K • !)x7'{;7 ^-fe*i-5fcfctw, x — 

HUE 7- A- K • !7>f K - ^7ir^g^&B<^ 
IttfEfSAft^^/^^^y Mot, S(rtB!7- 
/UK • K • ^^y^comrlE^^^^LT^IE!7- 
/UK • !7>T K • ^7ic7 y--t*-tZ>fz#><D$^7> • y 

^.-ifOzi^> KtciS^LrfHAfbv'^^^^^tb'r^ 

y-^£ig»U (2) ^ji^ ■ y-y^riiCT!7-/u 
K • V 4 K ■ ^^yt'T ^ir^ L, (3) ^S^E~ K^r 
A>^L, (4) a-f 3 -7 y K Ctl o t 7 K • !7-< 

^i^^, (5) ^-f3-7yK^P>'>ft<Hlo 
O^IiJ^ttffib, (6) ^^M< i: h lotomwi^m 
Aity'vyT^MZtitib, (7) ^c^fliAfb^n^ r 

im&mi 1 ] • ^f>7 - yy^o^yt 5 

mrta^^— • ^7^> r • y >^o^>t: 0 ^— ^— 

h7-^±Ot^ h^Tt7Kl/^f-^<i:, -^1^ 

• yn^T-r/u^^^tti-rxg^, 

^c^f@Affi^^ — ^ • 77 4 /lslzWtfh$thZ>T 

ffjfa^A^-^ — ^ ■ ^P77^^l:fett$ti53^y 

K^— ^t-^^T^c^^^ H^^fE^^ry^yn— K 

*<DS,m&mM K^^-^ ^ hco^tc^tsfb-r^X^ir, 
7-f/K:ftft^f:K7[) h 3-^>- KlctEoTfflAft; 

[B*«i 2] -t<©fflA*k*ixfe*rBB*^y yb« 

[W*«l 3] Utiles • ^7^7 • V>^^zn 
h 17 — ^ {"j; !7 — jv K '7-fK- 



-raises i iiciettois«psaL7^-^

fcfc<a^-*&^tr::£*»»£^5iS#Jgi l KKtt 10 
[!8*Sl 7] ^-^(lT^ir^-r6/c^C0Rijf£^^ 

— ^yh-r^. !7— k • k • *y^7* • -0-4 V - if 

— ^t^^-X-r-krfcor, 20 
$3L7* . y -^^^^ r>m^ .f^h. K7 

■ 17 K ■ 5 Soffit, 

Bt9mbK7-r^*rei(ii-s^nir3/-y-rfeor, 
(i) • y-y^^ibLrmjfs^M^^^Lry 

-^K'^^K' ^X7llgil, (2) r^zn^i^^ h 

stuu (3) ro!>x^hrK^ifia*3j:t;!> 

xyt^hnvyKCS^v^T, ^;/-y-^ HcX^i? 

^i-§J:5(-. y-^fc^u (4) /u-^ 

t^iBtts^j?)^, y >^ $nfcy * f^ml 

b^x7"t^hf-^^^^yu-KU (5) ^(D!) 
h^— Y*=l* >- MctMAU (6) 

-<r^r^ir^^ttr L*5*r, xaia^xes* 
r*r»f9jgu. mrie^^- — ^ ^ hfffflt-s^^rffiriESi 

ht^^ tSr«F«ti-5!7-^K • 17>T K • 9^ • if so 



4 h • f—^^m^^v-^o 

htffglifiAJB-^.-.* • /n77^y^Mt^:i: 

[«*«2 1 ] fiAfb^ixfc h***.*> bnmxit^ 

<D^^7^4 h^^^^i/^^i, o 

[fS^JS2 2] fflAffc**x*: K^ra> v H*flAft;£ 
[W*Jg2 3] flaAft^ttfc K*^;* > FttfiAfkS 
[W#Jg2 4] =^e^-;?-BE^»!9 

r)x7't^ h^^yH\ *3«ttJf3.— -fl£m(D7*— 

Somjf5^^^-y->r h7 Ki^^is^tcs^^r ^^y-y- 

mjlEtttB^-^ ■ ^y-^n^K^^>>-ho^tc^±a 

^-if^comrfE^ni^-y-^ h7K^««i:fe5it 
^7Ki/^^7^ir^^^TLt9tT\ y^^n— K 
life J: ¥±!flsxa «r » 19 ig-T ft <Dnm t , 

[19^:^2 5] SfriEfiAfcSixfcK^f^^^h^ffi* 
[ft*3S2 6] -tcOtH^^fiii^y y^-CfcSri:*: 
[»#«2 7] ^(D&jjmmzT^xy^-X'frz-
*r«g*&sxgo 

3^>-K«r$tfwi:S:««i:i-SW*«2 8 (cfEfflctZ^ 10 

> t e - ^ - HfT^Tffi^SXSo 
[ffl#Jg3 0] !7— /U K • K • !>x^±ro^-y7 

y > n - K $ tt it itu 15= ^ - * IE* £ »#j-r £ » 2 £D 

--^8a*^<BAfbSnfe*rBB<o?Kt^7^— -^s/ h-TS^t 

[»*:«3 l ] MM, fflAfkSnfcSrSSr^y >-^- 

[W*g3 2] ^-y^W^- ^f>T' K=^a 
y > h^f>fflAft:*ttfc K^y > Y\cy*—?y hi~ 

i&m^4s<— - y^r • K^y > h^es^f^-t- 

BftlE'W yx>rT ■ K=¥a.y > hcoMsSSrffi^i- 
5^co^rtfit£xat, 

✓MV*— . y^T • K*a^ > h<7)fl|atlcS<5V^-ca: 
ffifflS5£XS£, 

BME/^ . y^T • K^ra^ > h^bffiUSixfc 

^r — ? &mAlt£infz K^y > hcoff^^^^— -v- ;y h so 



6 

at, 

^_.^^ r . p^y > htcr^ir^-f^r ^-fe^ 
*^y v h^bffiflEx-^Sr^sil-rs^xai:, 

aT^^^nfctfjiEx-^^fiAfL^ti/c K=^y > h 
o^tc^^—e y h«7t--7^ nxat^jMi-r^ 

[»*:jS3 3] Hd, fflAfb:^*: K=^y > V*ky* 

y v h-r^fc^cD^y > hxa^JMi-r^c t &w«t 
II»*:jS3 4] BtriEffiBlt^xa-eit^^nsffiHii 

[«*3S3 5 ] H9IEffiH*g^xaT*jtS$tt5fi:tt 

y y— * ■ n/r-^r-fc^r t^^mt-r^m^ms 2 

itufEiiilfEW^— ■ ^fVT • F^a^M^rKi/ 
^IftttH-t-SXai:, 

^^ai~S^4?>OfeiS*SE«rfi«i-*-5xai:. 

HtriE^— ^^^e?^^— ^^aaa-t-sxat. 

f-^7t"^^/FU 7^-— ^ y V Ztitz K^ra> 
Vh^Mt6 7^5; h-f 5XS£, 

3 6 icmm.<D K^ay v h«ia*-jt 0 

[B*S3 8] SEC:, K*^y >- h ^r^u t^- Lfc 



W.£ti1Z7*—- ^y h©RSi:8ot7- ?%n7* —
-?y hi-Sxattftflli-Sr tSr«F»ti-S»*3a3 
7 inEtt^ =l ^ >- h &LJ1^& 0 
Itmm 3 9 ] mrlET K U^^ffJfS^ >y^-r\Z7<tt £ 

$ tb, 10 

^rWS^Ji-S XSSr^rffl-r 5 w t t ~t 

t A*t5lSS:flritS r i: %«F»i: t5»*3l 3 6 20 
[!t*3B4 2] ^77^^- ^.-if ^y^-7x 
=^y^tt5r i:Sr«F«i:i-5»*«4 l f;:Ett<E> 

[W*S4 3] ^77^y^- ^— if - ^>^-7x 

-*l*1SiS<z>*~ KT-^^tLSC i:«rW«t-rSf*3ft 
314 2ME*6<0 > h*QS#fe 

IW*J14 4] HtfiE««^^~ K(i, (1) h91E^7 

3/^ . JX — if - — ^zn — 7t~7y^ 30 

t57^-^Ki:, ^M^i/a 1st. *^*- — t, _b 

E&aa^r^ =j > £ s ^/m*^-*— k t , 

(2) HfflE^^^-r 3^ • 3.— if • ^ y^-7x-^^s 

51 ££W»i:1-5ffl#Jg4 3{C|EifetD b* bt& 

IW#JS4 5] fffflEgyJvfb-E— Kt-fe5^7 7-f 3/^ 
. ;x— if . — ^7 ^ — a^/n-.^7^7' 

K^ra> > b&77$is>yTZKfcmiF£tlZ>Zk$: 
«Ff» < ti"S!B**4 4tc|Ett<7? K*a^^ h^l^rfe 40 
[B*S4 6] • T • K*^>h£ 

ffiEW^— • ^t^t • > bfrbttotis&tilt 

T b*\sx&#th-t%=*>>T-f 
I7>f^y^t§^77>f 3/^ • 3- — if* - 4>?*-7

7-^^^^y^^^yp-Ki, (2) ^tbfemzti 

^b^Of-^^t, (3) 7^-77 h£*bfc 

U (4) fitrE*!iS«ffi(w«eoT7^— h^nfc K 

[ll*«4 7] Mt^ 7^-7yh^tlf:K^r^>y 
b*&M-tZ>m^. 7*—*yb£titiY*x.*>b& 

[W*3S4 9] «ljE3y7ti:&57K^S: 
K7 5>^*3j;tf Kn K7y ^ • Kn 

[»*JS5 l ] tfJ7 4 y$ - ^--i^ • 
= ^*Jr«i-Sr i SrWWt-f §1**315 0tcE«O K 
IW*3g5 2] ?*y7 4 - ^-—f ^^-7x 

5 3 ] «rffia*«5^~ Kfi. 
(1) BSE ^7 7 -< 3/^ • j-— if* • 

mt&*— Kir, (2) fflE^^^^ ■ ^ • 

[»*3S5 4] ft/Nfc^E— KT*S/TSilt^7 7^ 3/ 

^ . ^_if . ^ ^^^^^^ 8(rE^>f^— • 
[0 0 0 1]

7h!7-^l^^-^f>f7' K*^ > h*S 
K^jlc^lSFrLr, rti^K^a^yh^b, K*^ 
^ > r*0«3S<t'*— V t7 X (personalized) 

#31 Ml** !7 — K • K * £^X±cq, fe^i/^i^^ 

• *tVT • y >^£if/tf- hf^W^yt 0 10 
i — ^^3/ h y — ^_hco N b^fcyW* 

=■ a.— *if— tr;*j&>e>, fesiMi^^v^if— tr;* 

[0 0 0 2] ^ — 3 — .AX (New Yor 

^t°-X/U(People)^^f^^^-r5^^X 

pft^whftiBfu y-^^xsaxfc-T*— 20 

bhX£Z> 0 Zhlc, ^ofiAfb^-^^sRWafiH, = 
>7^> (content-based rules) ^r'atf ^ 

Stic* feS^r— !7— KSr*-r51E*S:l»^l^ fc^t^ 
[0 0 0 3] JilBOWI(iWW^3iX(wB8]l-r6 *>(Dti&, 

TC0»»(i!7— /UK - !7^T K ■ !>^X(^B8LTfT^tLS 
[0 0 0 4] WW!>^X_kCD'MV*— • 7 ■ K^r^- so 
[0 0 0 5]

[ffi*©Sf] !7— /u K • !7-Y K • !>3iX(WW^3iX) 

^^n— • *t*T • y >^£iltR-r3^£tcJ;oT, K 

[0 0 0 6] ^^^xycO-M^f^^^^olj:, 
• -^- — ^ • if— t'^T^ 3.— if^r^'il^t-o^T^^ 

X©J;5^ XZhhjZW^ - =3.— ^ ■ if-T hco^f 

[0007] ^tf ^^v— ^©(a^, " >r 

V^-f tfv^T/U- * -< >^ Individual Inc." <D £ o t£ 
^^--if^^#t-f-^ o C COT-^If (D^- -l — ■ if™ 

^(i, ^x— if^^r— KlcS<5i^fiSS^«fiH* 

m^mz&kitobtix* ^-—Ficmm-rztzitoicyT 

ti*< T K^i^ >- h^ffSr^Sfc^^fl, ^co K^^ 



[0 0 0 8] 3.— if tfSjPrl§l!*fflAffc (personalize) Lt

:iCJ:ot^tfc5o fe<tx.tfCRAYON (Create You 
r Own Newspaper) I*, — if^, 2 5 £A±<Dmt£ 

# 5 J: 9 M^S 5 0 CRAYON 1tt 10 

3 — y - ^^XoggRBBflMflk 

^ h y — h • -y— r^<D^i/**m, ->^7^° • h y tr 
^-y^^^^I^y >-^4:«"r5fflAft;StLfc»f 

HTML (hyper text markup language) y — ^ * 7/^^ 
[0 0 0 9] 

Ir^T^fc 9 , ^fcf9i"Sr <t(i-e#5«t 5(-i"St<^ 
IP*>, Ate, IrZ^bJ&iMto^—is&jk 

Ati, WaSW^AtJ^^i-St^-e, ^t^o^ 
[0010] gi-Sd, fe^to-^-;* • ««-^— trj^ 

[0011] 

O^oOW/N-^fVT' K^y^M-Ty-t?^ 40 

^tbfc^— y • y y — £16712(1 inear)£> K^y is hcO 
K*n.y > ho^dy*—- h-f £r <btcj; 9 ft 3 ft 

So 

[0012] sijoffii^joi^T, • y 

t • !)y^^yta-^ • ^ hy — p^h*?— so 
y^r^i" 5 fflAffl-^-— * * /d77^/^
fls£t><7>-c-fc6o tc1\ ^rtfVN-f • y-r^r • y > 
y (T^^t^-y^;/ h <7 — y T^T^-fer^ft, ^S^e- 

— k^bb*6$h, • y^r * y >y 

~— hy-yte^-w K-cfflWfSft, '>*<£fc 

So 

[0013] s (b^ucDffitc^o^T. ^jswiac^Xir- 

h#l?£ ■ T^— #&MisXTJ*<Dlt1to<D, fflA{fcXn7 

y^ k * y^y^ry ir^s^tcA^ft, y— a- 

<BAfb7r^/Wi»jW*n5o 
[0014] $ ^icS | J^offi^c^^^/^T, ^^^^^ 

— • y^Vrtcy >y l/z^ y^. 3/ M7 — 

-^3/ htSfc^t-feSo »ito£jh,;fcfflAJfl — a— * • 
7"D7r>f^S$n5. rcDfUA^^^. — ^ • 7*a 

'^7^7' y>y(o=r^t°^ — y 

^b-r— y £r y-tr^i-5/^^corr^> Kt"— y , *5«fc 

^ • ^7°n y r-r/H-t&^^tL^>r K^f-^C 

^ ♦ y'n^x-f /Kcnifi^^i^^^y K • -r— y 
(-g<5^Ty*y>c3- k$^„ ^^yy K^n- 

K^^yyhfi, fiAffi^^ — * ■ y°ny T'iMz.i&m 
[0 0 15] ^ ^tc^rj(Z)ffitci^ol/^X x ^P^i^:, 17— ;u 

k • 7-f k • ^ai^-y-^r M>fbx--y£^u 
t 2 '— y^, iAf^nfc k^^^ > h^ft:7^— 

So **Wfi, (1) 7Y K- ^xy|^ c 

<Di7^7t<D=**# is a >-£mcxmmi-z>tiisb[z^ y 

• y-^^H^L, (2) ^xyf^h-TK^ 
*«4bt;i:!>xW h • ^^K^IU (3) 



r>^y- y-^tefft^U (4) /u-^&BfSL-tZ

u- KU (5) *<Dt/^-?y-<< b • 7*-*£iR7g 
a^yhl:»«U (6) ftjftSftfc*^ >f h • T 

5*"C\ IS (2) ^blS (5) 4r*Sr*<9igU, 10 
LT (7) 7t- ^ M»«t-«<S^r, S»?gK*~> 

> h%mAit£titi y**.*>> v<DM\cy*—*y b~t 

[0 0 16] * bf-S'JOffii^^-C, #3£9!rt:. !7— /V 
h • T Ki/^*«, ^ — WS8Lfcl>i^ h * =■ 

stu ^ft^m-r^^^y^^ b ■ r kix^w 

y-^S:e»i-Sc • y-yj* x ^-if 

5:£-t<Z>T ^ir*£ih/T L£ 5 £ 

[0 0 17] $ 6(-S'J<Z>ffilC*5V^T, *7— 
K ■ K • *y^Ai<D3r> : 7'{ > • — ^ • — t=T 

7 • ^^fci^^v K ■ t£bTf 

{cffimy*— -*y b - K^bftSfflAffl^ * 40 

hi"SJ;5K* y-^&ibU fiA^ 

— 3. — * • zfxny 7 ^M^M^^titz^'^>' K • ^ 

• y"* y r ^ Mcfeffiztitzmmy y h • so 
[0 0 18] Sb(CglJ(OB(C*3V^-C, s»4s<
- • T • K^^yF^ fflAfkSttfc K^a.^ 
v b<oj&\zy^~- ?y Hf5t^t*fc5o ^^f^- ■ ^ 

-^t'^7 - K^-^-^ > F(7)Ml:lo< J: 9 (cftt/E 
v b<DM\cy*— ? y b-tZ>tzMz]fej£;£tLZ) 0 * 

i^t^oTfgAib^ttfc K^yrf V h^tc^^— h 

[0019] ^ p>[csij<^ffi(c*5i^r, 

•^f^7« b^zL^- 1/ b%*&m-tZ>Tz#)<Di/*'rJ±'T: 

^7F>^^t±iL, /n^^n°— . T • K^r^-^ 

titer— f&mm-r&fzzbiz, fei77^'>3^g 

[0 0 2 0] #ffl*HJS?gffi(c*Di^-c, c^y^r-A 
(i. tfyy^yt • a-— If • >f >-^ — — ^^ilL 

ti> ilfcot, ^H^ififfl^ry^yay 
*e»i-s*jLST>r =«^^?>RfcSo ?yy j y>? * 

»:<0^~ Kfi, (1) ^77>fs/^-a-f .>fy^ 

^^s^ 7^- bt5g«, t&mJry'i/B 



k, mmz (2) ^77^^- ^-if • >r y

ft/Jv=e— Ka>ibfiE£o — ft/J^— KT?!IffiS 
TFZthtzf^y j y$ • a-— if • -Y y^ — :7:n — y>H 

77 4 • 3-— if • >f y^-7x-^^rI®^Tt^ 
^ (irtcj; 9, ^>T'*— -^f^T' K^e-a. y y hcD:/^ 

[0021] r<z>®Rjia, *&mfr^fr{cmffi£HZ> 10 

[0 0 2 2] 

h • ^y K^X^cD^y K^®o^W^£PCSgl 
3o =t>-t: c ^ — ^iii — ^e~y — y 

— **5<fctJ^— if • = -^y K&fT^iMftf:*)^*--*?— 

[0023] = yea- ^ili n y^;*yK 

^ y h T 7 — 9 - 4 y^-7x-^ 1 1 a CD J: 9 CD 

i^t^r i:dST*#5 0 «>f^-7x 

!7 — ^ (LAN) (d, £7t!3:!7— A- K • V K * 
$3l7<D£ 0*£V4 K^y T*s/ h^ — ^ (WAN) 

[0 0 2 4] »20ll = y^ — *§8«l<Z)rtffl5flKfi 

fC, :nyt c ^ — 1 ti, =3 >t°rL — y • s<x 9 Id® 

j*£fr5**&S^-=:y b (CPU) 8£-^£j> 0 40 

7x-7 6 , y 2 ^ ^^"U-r - 4 — 7 iL—X 11, ^ 
y V !7 — y • -Y ^^ — 7^ — 7, 11a, =*r — tf; — K • 

y^-7x-7i 2, -7^'^y^-7x-^i3, 

fc**L-P*i=y t e ^ — y • 9 KSSJBcSnSo 
[0025] ^y^^y i4ii flAffi^— * • y* 

uy T -f ;u . xrV ^16 %ti\^3LZf • y*y yy 1 7 
<D&lteT7!)lT—isa>&mn-i-Z>k%\Z % CPU8 
l-«t 0&ffl£*t5*:*!><0 7 y^i* • ryiry. • y y • 50 
■ T7* ] J tr—is 3 y £\ T^^^K7^y5^b,

^y* y ^e- y 1 4^o-kl, ^1x^^77 h^x 
t • T^y ^-v-a y-<y • ^^y 1 4±r-n^T 

^^16, ^xy.yyy^l 7, HTM L 7 d ? y 

$ 1 scdj: 54^7 h^y^r • r^y^-— ^>3 yii 7 

>-n— Ki-sc tie J: *9 , ^^i^^>-t 0 a — >?tB^tS?& a 

[0 0 2 6] Y=7 4 7^\%, "T** V * 77^f 

/u*3 J: t/lf«7 r r ^> t 2 — ^ 7 r -Y 

l. ^LT±fEtD^p^y^ h^r • r^y ^ 3 

y'7r^^»i»t5o -^cov 7 V V^T • T7 P V 

DOS7/y^-^3y, *3J;tWBAffl= 
*7r>f^l 5 ^r^atfo fiA^^^— ^SlSt^r-OH 
5fl fiAffi^-^ — ^ • y r/f;u • ^16, 
?x^./yy^i7, HTML7t-77^18, ffl 
Affi^^. — ^ . /D77^/H 9 joJ:I/if^ h • 7 P P7 
7^^2 0^tf o fSAffl^^— ^177^^1 5CD 

[0027] (K^ayyh^IS) *3Hii»3 

^^^fe^cD^fT^HI^-r^o I3AK1 

CDT^^o • if-f h 2 1 CD^t-tt— yirj v^2 

3 cd J: ot£<0"ry? cd y ^^^5^ — 

^2 2^u, r^>r >7 f s/^^ti»:(^lE*2 4tcy > 

z> 0 ie^h 2 6 \%hor>t^co^^ • -y->r h±(-*>s 
cdt\ y y^y 2 5^-y--r hmt^fy >^-^fe5 0 uy^2 

y^^r- K^r^y y h^^cDj: 9 i-^^ccD^^y • if 

[0 0 2 8] Ts^^^T" • if-T h 2 1 ^^b^^-T 

^fc^ic. • if-f h 2 1 

^tifrh «t «9»ifffl(wttwsn5*^ m^cD^iijfi^^ 

y • iM b 2 1 <0«at(-, ^^l^^^y • if-f h 2 1 CD 

ffla^-^ • y y-^twft^tHStt, r cDy y 

13 Biat^-TJ: ^^^-^ofltfigSrftfip-t-So 1^ 



[0 0 2 9] Wafflx— * • ^y — 2 7CD«$tt^<oa»
oWtRSr^-TSo *tco^ 1 i: Lt, ttfflT*-^ • 7 y - 
2 7(1, t>T h 2 9 ©fc*© 1 oWl^fty - 
1 5 - i ©-e# h 2 8 Srtt 5 0 ~<DS— Kll, 

JE(-. ^>fy^V^^^2 3^IE*y-K3 

l #£*{c*tt£-r£^ vf^^^y- K3 0 
^:^t^^o ^2(c, «tts-r— * • 2 7d, 

y — -e&5o fi^r^(^3 aei-cii, $-^-^22 10 
7^, >f ^f-j/n • k# 1 £T\ #:i::ffi*c* 

T\ ^H^P)*- i^— ^2 2 (CMS^— ^SrH^i"S 
^ ro/^^aif-^ • 7y-2 7 «rfts5£:#{c 

[0 0 3 0] I2t LT, ftfflT^— ^ •711-27 <Z>« 

K> , r^rtfj:, — ^^V^^A^-y--!' h # 20 

1 ^bi^Y h««y ^^2 5 ^iiCTHS^H 2 6 (17^ 

[0 0 3 1 ] mm^, fac-mRL-fzZ k^fo&tiK 
fEVfi, i>x^ • -y-^f h 2 1 <D«|}£(£>/i #>(;:, &&VM1 
l/fy?*/^yTj 3 ^fS^2 4 (D\Hm<Dtz 

fed, T-^-7y-2 7^b^$titP5o 7t irx. 
tftE^EirGfl. JSffi^- * * yjJ- 2 7^ibl&^£n 
Tlr^o **Wfi<tSi:, JSfcb^— * • 7y-2 7J1, 

E-cift^S^ru^ct oteWkMV*^* is h 3 2(C^±B 30 

fctt&ttjT*-* '^y-2 7^otta^-f 3£ ^ 

[0 0 3 2] ft«tC, iRTK K^a^y h 3 2fl, 3-— F 

Jl, -y-^ hco^-^K ^yf^V^^^^ IE 
**iri^ov^T, $*?*47t v h*3«fcTJ*fe (£/cll 40 

K^-^y yh3 3ii^-^i:»fflSn5c 

[0 0 3 3] ^^^"^^^v-^T^^^UtOSI^ffitC 
*5^-T\ JiifiLfc. $=^7 • h 2 1 frt>y*—~ ^y 

t5^t^#5o fc^tf, ^'^1^2 1^ 
J3"J<oy ^Sn^y* Mcfe^T) twioV^T, ¥ so 
mt^titz K^rrLp* ^ h 3 2co^(ca:»usi3!ii-sr.
<k^T-#5o JSUoisifi^ffi-m, rttox-^ • yy-2 
711, 7^-— h^titz K^^y > h 3 3<om\z, m 

il, !>x^J:o^-^^7' K^ra^yh^ 
[0 0 3 4] ±lE»ttT*ili-<^:J: 

»siuj:fcj:t«t!i<0ra« (7^--— ^y bmm<DX it*) & 

!g5[E](l, (@Affl^^-— ^ • ^o^r^^l 9^3^£ 
* 1 — * •7 p o77'1';H9, i^-f h 

. ^C2^ r ^/U2 o^i^^x/ • y — 4 bm%&& 

[0 0 3 5] {SAJB — a — * •7 p D77-f^l 911, i 

^r#i-5 0 flAffl^^— ^ • 7 r 4 ^(ommfiH^m 
l "CaftW^nSo 

[0036] ^h-7n77^f^2 0lt 

ti\ ^-r h •/o7r^^2oij;, -i-^^-y-^ h • T K 

D7 7 >f/^W#ilTM^il^ — ®W*f">f 
htiMfttSiM' h • ^n^ r ^yu2 0 t^ft#j ^ixT 
T\ {®Affl-^ — * • *?Wr4 fr \ 9(1, t'f h - ^ 

««r#fiaL, ^r^/c^i', fHAffl^^-^ •7 ? d7 7 ^ 

xmmtsfo^&b^, mAm~~—x • 7"ti7T^^i 

9(l-y--f Sr#Bg-T5^i:^-e^5o h ■ ^ 

O77>f^2 0ll -y--f 1 ^"http//www.sjmercu 

(1, -^^-y-^ htta^^^t^ft^^o '^oT 

fSAffl^^-— ^ • ^077^^1 9 ^Sfrcot>^(-t"^> 

ItisbiZ, f-Yh-7 P n77'l'^2 0 ^(tdSJEStlS^ 
[ 0 0 3 7] *7^zf • y 4Ji, •

3 s&fr^xmv^zfkmis-rzTyytr—isa^ • ^ 

y-^3 4fl |Affl-^-^-/o77'f^'^ 

SrU K*^> > M>b7*—*«:8l3!f U ^LT^ 
^16 (^^^ 0 

[0038] m4 0-eifl 1 wsixfcj: otc^ mxm=-^- 10 

-3 6, r>3i^ - y— ^ • ^>-^ — 7^ — X3 7, 
Vyt ft* ' -r^ — is-Y 3 8#3«fcT/:7ji— ^s/ N ■ 

. xf> * 1 6t!>x^ • y —^3 4 Cl-f — ^zn 
~-^£-£5o b • K7>f^-3 6fi, • y — 

^ • >r >?—y^— * 3 7 ^iiCT>>^^ • y —^ 3 4 

7>h7^ h ^y^-7x-^^t5o J:^a 20 
cv^t, t-Yh- K7-f/<- 3 6(i, . y — y 

[0 0 3 9] 1M h • K7^^-3 6f*. t^f h • 
77>f^2 0£l&^3lT6B#fc, *ti*:tiT 

tzv^y - v<Dm^, -y->r h • ^077^^2 0 30 
T-3t»stbfcis— ^fltig^aeijiL, ^^^btfesR^n^ 

u ^ tt^^y • -y--r h^bof-^^t^ h ■ 

[0 0 4 0] ^077^^-^-^381^ fflA^t: 
$tLfe*fBHS:^J: 5 fcy y h-rz>fr&1%fe'i~Z> 40 

oT^, 3.— if^, mm^y^-— hi-fifcfccofflA 



[0 04 1] fiAffl - ^ — ^ • 7 9 n77^^ * *-*f 4 9 
16, K^>f'<— 3 6, 7°P77^^ • — 

>^ 3 8<Dtz$><D'9*>'7 , /i'<D = *- K*M^»3 A(d^*tt 
5o AEI*3J:^5 BEIli, <BAffl = ~ — * • :/n 

p-ff-Ft'fc^ S5A|it flSAfflx^ — • 7° 
o77>f/H9©H, • -y--f OT^ir^*5 

fflAffl=~— * • /n77^f^ • ^7* 
4 # 1 6^»^SrttWi-^ 0 
[0 0 4 2] $5 Airo^fi/yS 5 0 0tC*3V^t, 
Affi — a- — X • /D77^^' ^ 1 6^^- — ift' 

«t V*?— h ^tL^o ^//S 5 0 1 iC^o^^T > 

^^m^y- !)-y3 4«rfil)St5 0 ^— y-'^ffgA 
I DWn/)^^r^/S 5 0 2 Izte^^xWimZtl&o 
fgA^-^-— ^ • /u7 7^/M^(D I . Di^Lta 

X7^yS 5 0 3^^f7yS50 

* . y T ^f /w^^?£L^ittv^\ fSAffl-^ — ^ * 
/D77^^ • tt^f^7 P S 5 0 5 CD 

^^e-K" CA6c — KtrA^i:. flA^^ 

a.— x - yayr^^ • ^"f-f 9 l 6(i^^-/^S 5 0 

/N^y^- ^f^r- y y^*«»ft5tv^ xt-^ 

>K) £A^U -?:(7)=2^^ KS:, -y--T h • Yy<4^— 
3 6«rffiot9x^ • y— ytcteiH-T^o t>f K7 
-f/^3 6it (7 31^ • y — ^3 4^ttT^fc^^^ 

ti, mxm^^—^ - -fvyr^^ • xi T^<? l 6(i, 

t^h^7'7^^y^ (rtSfpo^-r-'-?— y >^_h£^ y 

[0 0 4 3] ^JBIiJIi, r>xiy-y-^ hc9 

xl— if^^^o^ j,^ ^/^-/t^ >^coT(- 

yb, F^^i-S^< (content-based) (IP^, !7 

*(cffi^-<#r t^T*#5, (2) *>S 

RSr*T-rSE*«rBft^i-Sr ir^-e*5, (3) mfS<^ 

^-/^a^^^^tt ^ : ^t*^, (4) # 
IE*^>il«S:S*"J-SZi:^-r*t6, (5) j££r

OJ^ftfeS^'f ^8E**l»^i- 5:im5c 
[0 0 4 4 ] «a6WT-t LTrt«(-S<5< content-base 

[0 0 4 5] ^<t^, rcD^lUtl, — IfOiSifc 

O^rf^i-^fc^tc, — F&ISfel, 20 

& 0 c^s-m, ftic^cbr b-isxmntmisctD*. 

f^h- K7^/^ 3 6 I*, K^^tiM 
[0 0 4 6] ^xs/^S 5 0 8-m, ■ =i-^V K 30 

^e^bti^aiusr^a-rssflij^-^tt, 13 bio 

£ 5 *ttttl (extracted) -r — ^ ■ 7!l-2 7^J: 5 

^coy^hr*fc6 0 ^f^^ssogrii, f^m 

TIT (BP*>, a^r)x^h»TLT) 

y/S 5 0 6 CI^ »TLt^n«:, 7DHi7fs/ 
/S5 1 O^iftfo 
[0 0 4 7] Z<D&X\ {SAffi — * • ^nr^r-r/U 40 
[0 0 4 8] 5M<Z>JBI6-CI*, ffiAffl^^— ^ • 7 r

mtztf^*-9-Vfj:h*&mm-tZ>Zb{z&K> , ^-if 

[0 0 4 9] ±ffi^fftWS:ct 9 <t<3ltti-5fc«>^, <B 

Affl-^— ^ • y^yr^^<DmMMW^^n^^ b * 
sBmxmmzfri&xoic ^mitm/^x\ ^-ifci, 

^7 ^ ^ oSrP^^ ^ >- V b St«jE"T 5 ^ Oil^^ 
4t5 0 BP^>, ^f7/S5i it% jPrHffl^v^u— 
h^i^t, f[§Affl-^- — ^ • ^n^r^/n 9 

^/^S 5 2 0— Mfti-5fr<Dm%l&-$'?Lbtl&<> 3-— if 

^ry7 p !/-Fwfii^t^^ £7tf*^£>*fM 

— (^^^^^S 5 1 3^iltf 0 

[0050] ^^^/^s 5 i ifti, 
ry/s 5 i 9tc^6^r, tri^mxm^^—^ • 

77-f/K:i^^ 0 o l,tz< fa^m&fo, 

r^/ss i ^zxay^—^yb^fj? 
3 9 7Wt>*m£n£ 0 

[0 0 5 1 ] y^—^y b^^f ? 3 9fi. ^--if^^: 

^ >^^SfP^i-SUB'J-r-<#^\ £<D$ 4 b<D 

^*fcfiif*>bt^-— !f^^aT*#5«t pCt^o IO 
^O^^^if-f hO^^.-^l5*^^l LT 
*\ ^O^^^if-f hcona.— ^lE^PSrt'i? 

*ilt^-rSl<b35ST*#5o S^Ufs'^S5 16-e, 
[0 0 5 2| ^ry^S51 7 7;*-—^ y

^ ^ 3 9 (1 N ^^—yt^co s -fyf^^^Ayr^ 

3/ hoixV^ 3 9(±, if -O^^^^/^^x 

[0 0 5 3] -MttOflja^^^^^ • T^^U— b 

-3 8Uf:, :^^^A7t-7i/F^ryyi/- 

<OJj*# A • ^ V^u — h^tD7K-f ^L< fi#»S 

[0054] ^T^S52 Otti, f^Affl — ^ — • 

i 9 y * h ^^^if ^ M»biE*£8l3JH- 

W\ fiAffl-^ — ^ • /n7r>f/H 9 (^fe^grf^^V 
-fu— biz^^^X?* — v y b £fri& 0 7*—~?yb 

£tifzmAik£Htzmm&^ftfrbm^th%$\ mx 

if^fi, ^-fli^oT'():©^x7xa-^^|y7 
[0 0 5 5] flAffl-^*-^ •7 P n77^f/H 9 S 

5^6^^- ^IS^^^^y^n. — ^ • -y--f g 

[0 0 5 6] (fiA^^^ — ^ • ^n^r^'i'Srffl^jfc 
K*^vh<D*£s$t> 31 6 0(3, #3B9l3HBAffl = ~- 
Eli*. Sfc. mmztitzmmmBcn ^ h<7><*>

KW*lx5» ) 

10 -Yh7 P D7 7 ^^20, ^^~f • y — y 3 4. *5j;U!fcH 

[0057] • ^y i 7(i, ¥ <d$ 

/n77^/H9^5 0 • 7°y 1 7fi£ 

fc, — jRft*^ bmm&^&xy-'l b • /0 7 7>f;i' 
2 o ^jiSo v^-f -/y^n ti, m^<D^^y • 

i^-f h (IT ^ ir ^ L, t<D^^ b^b*?— 9 

bwx\ mxm^^—^ • /P77^f/n 9^lwz^^ 

20 h • ^077^^2 oi:fc5W«lcfflor, ^zn^-tJ— 
3 5*aiC-CC7^^(C3^^ h"T5 «t ^tC^^ny • 

tty*—#$:$=^y • ^y l 7{zm& 0 V^-zf-^V 
1 7«^f-^^LTiHif-^ - yy-S: 

ii^^r^o • ^y >^ 1 7(1, ^en^ib. 

^lt, ^co^ K=¥xl^ v h^tii^^V^ — ^zc 

— ^4 0 ^iiisrtJb^iffltw^^-^^ h-f 6 (fas, z-<d 
wmm<o&(Dmizte\,^xi:vmm{zmw£tiz>) 0 

30 [0 0 5 8] S6ifKK$n5J:5lc, r>3iy.^y 
>^i7fi, ^x/ - y-^-^y^-7x-^50, 

3-— ^^^tf 0 v^--? - y — y * 4 >-9 — y^—^ 5 0 
±.mcD$^y • y * y?—y^—^ 3 kd^ 
mi, 931^ r - /y >^ 1 7*^^^ ■ y — ^3 4 \zm 

[0 0 5 9] i^-f h by^y<—5 lit., F/p77 
-f/l-2 O^Octl/fiAfflxi a.— ^ - /ti77^H 9 iZT 

4l^»«-T5o -hfEcOj;5t^ y-y3 4(i-^ 

^m^tifi^-^fi, ^x^.7 p yy^3 7ta^ i 

[0 0 6 0] y y — • -?*~i/\~4 1 i^fflifi-x — ^ • 

^y— z^m-tZo z<d&x\ 7y-«v^^-4 

so y>^l7^ f^CE^t-— ST^ir^-T5r t&ffilt 
5 y Sftfcy * h*#!BLt\ t^f-^^D y

i <ntL$xr>y-^7A' = — Ktitt® 3 B{::^£ft£ 0 
[oo6i] 7^-— 4 ttm^— ^ • ^ y — 10 

flijgflii:, fiA=~— * ' 9 d^n/i 

- KC0^*ilf+S3 BlC^£ft£ 0 

[0 0 6 2 ] i^HCf piS7iit SEnBlT?*> 
5 0 ' ^y >^ 1 7^. U-^34^f 

ot\ ^©J:5(-bT, OAffl — a — X ■ ^o^r-f A' 20 

[0 0 6 3] ^x7'-/y>^fj;^f^/S 7 0 0^b 

fi. 3-— Ifcoti^L/cOAffl-^ — * • ^p^r-T/^ 

K7>f ^SlcfeilftSHfc-r^^^hfflAffl^^ — ^ • ^ 

-^-*«l (i— AW-h^)^-— iftct «9ffiffi*n5-5rfiE 
tt tf* #> 5 fc liW^i (-t<£>5 -b^UBfiT" 
7*jv h £ LTjg^$ft£) fiAffl^^-^ -7^077 
^/^3Satc»j|ft*nTfcJ:\/\ -t^S^Snf^fiAffl 

— a — * • ~fn 7r^/^ll^ HJ-r^-TCtc, 

;/y V* 1 7fi, ^r-7 7°S 7 0 2 T\ teA,bt>*<D~=L 

$fcfi=~— *K*tfSfflAffl — ~ — ^ * ^cr^r-Y/H 

9 Lt^$ix^-<#^^t5c 40 

[0 0 6 4] — 9 K^-r^5tc# 

tE-tZm-SlZte^ ^Ti/7 P S 7 0 3T\ ^<0»*rt£ft*: 

Sfts^sElt^f^^s 8 0 1 icitfT^So ^ti(->?f 

- yyy^i 7fi^^^^s 704 xv^-f • y — 
^3 4^p^r>*tts-r o r<£>»fNa* fiAffl^— ^ • 

[0 0 6 5] WXfimZtlZts ^^y- y-^3411 so 



^fi/^S 7 0 5T% !7 — yuK • K • ^^(DX 0 

^-3 5l:8ito„ ^ft*»fb, ^/./yy^7 
(±, f@Affl-^— ^ • /n77^H 9 £ftfc 

7"S 7 0 6t-SlWO!?^t^ MdSfiKSftSi:, 

fflgii,m< tz$xn^^> K/y v^sr^^i^ • y— y 
3 4t;W5 0 r^^-r^ K/y 

[0 0 6 6] y/S 7 0 8t% * 

or @w<e>*«£s*?> ^ti&v^y * >^y >^ 1 

Ttf^ft^c f@AJE=^ — * •/o77^^1 9C0SIIJ 

IS, fflA^^n^H^bT*— ^*l»^-rSfc4?)C0, « 
aSW*l*3S(-S<5< content-basedS^^fg^i-^o ^ 
^tffcSWfcgiaUl*, C^3i^ • y— ^3 4tcJ: f9T^ir^ 

W^ftfcJ: 51-, fSAffl^^ — * -^77^^19© 

[0 0 6 7] MM'J^^^^fc^{^Dx.T, Xry^S 
7 0 8 Tli^fV 7(Dl^[Cj;^^^fbtl^ IP 

4) yy— v 1 ^ — 4 K^ifj^s^nsisttj 

tTHM^iitJ; 51-. y >^ £ftfcy * h£# 
iLt7n^^(c^f5^i:^T#S 0 • ^y 

>^17lt ^T7/S 7 0 9T\ r>aiy^>f hCD^^ 

^^S 7 0 7^MZ>o X>t^(D^=r.y^4 b ^^tC^ 

^7yrs7iot\ -/u^n 

fzbbl^, 7 y -7*-^^-4 1 Srftffl IT, f@Affl^ 
a.—^ ■ /D7r^^l 9izm^>^^ hSr, ttHiT-^ 
■ ^y-icfc5*>f h»fi8««(kW;Ri-5 0 £ feK*< 

7io/>^^77/S 7 0 6i:l^ 
i%XL$.yb, 7D-tt»80©^ryyS8Oli:a
tf. 

[0 0 6 8] <&!3ffiLx-;?0¥±l{fc&:7*— 
Iattened)$tLT7d-— h $ ixS <7?^Sr^i-7 p — 

f^-ht*fc5o z<Dmw<Dm&\*. v^zffr^—$ 

ISIS, ¥&ffc&SlfcJ;tf:7*— ^ H«kSli. '>^<<t 

men? - t&X'ZZo 10 

[0 0 6 9] msm<D^y^ ^y"S 8 0 1 t\ W^'~^ 

co^^^/fftTJ;^ dOl&^ilfifllAffl^ — 
[0 0 7 0] 7 s — ZfrBM Y**-* v hco^(c^±B^^ 20 

y*—*y h^nT^tcHAfb^nfeSfHfiffi^i^ v 
0, fV^^M2^gMf6fV^7 P M^>^-7 

* l l , i/:l^f^/7 r y 9 ^ - -< i/$ — y*. 30 
[0 0 7 1 ] ($2^)IiiI) ■ • • HTMLC7 ^ ^ 3/ ^ 

il77>^y3>i:toTj!!ii$ii^ &mm<Df&2 

(DMMWmte, l2itM$tLtHTML7^-v 7 
^18t LTKW3*l5 0 50 
[0 0 7 2] HTML7^-t^ l 8 OWfi, #tHiH
A^Canon Information System Inc(C J; 19 KifiSixfe 

L — ^ y&l 8fi^7^^ - :7*— *y9<nnW&1& 
[0 0 7 3] * y*— ^y?\-±, *yhXy~— 

yx—^yZteV^y*^— i/frh'r — ^£rttt±JU ^co 

> ya^^zny ^ y h1~Z> 0 ya—TyhS 
tht' K^ra^ > HifWJ£*x67K RTF (Rich Text For 

K(MS Word), !7 — K * :7m^ h (Wordperfect) . !7 
— K • ^ K(Wordpad)^(7? N fc^^>^RTF^ ^r^'T' 

' y*—?y?l$, Microsoft ^Windows (&»iS») 

(oj;9^^yK!)»^ y^y-y^—^y^ 

>- (^FIU^) ^/7>^D^U Windows^^^— h 
*~=.-ti*bVzr.zf • 7^—=rs/^Sr3®«L, W\m<Dy 
^y* ' y "7 ix* bURL( uniform resource locator ) 

• y*—*y9 • r^^v-co^ 1 — Kpy^tS^ ^t-J: 

[0074] It^i^ail^i^ot, ^^.^ 

• 7^-— ^s/^l-i^ lfi^fcfi2fSa_hcO^^^C7)^— 

p77^^Srffifflt5-i:li4i\ trL5, fS2MMM 
n<ny^y -y*-^ry$\^ ifi$fcte2ffl«_htf>* 

^^-^^-^^iiCT>ttHS^;(cA^$n^ 0 

[0 0 7 5] «Tti 5»*IBI-RW$ix5355, ^^y- 
7;u«^- KTl^it5 0 */MM^t- K-Cli, 

^ • 7d— 7^^©^77>f 3/^ • if ^y^-7 

ttii^^i^c ^cD^j^(ij:or, ^»if^r>m^^ 
ZZt, z?*—w h-TZ^ks teXXf-fV >• YT*7 h
[ 0 0 7 6] • -7*— * y $ <Oy ^mSfe^— VX 

coxs- v^couRLT Ki^SrA^Ji-fcfcifcxZM — 7^ 

&(D^^7 — 7^ — 7 • ^-<—^k, yjtr — ^yY&ti 
ft K^r^^y hCO^l^-^^yut'a- (preview) "T 
Z>tz#><D^ — y*L — 7s • ^lt, K^r 

^.^ >- h%ftim-rZ>l)\ K*r^y< >- h£r RTFZ7T 

^;viLtt-/t5^ RTF^ ^ L 

■ 7^-7%mm-tz>o £i\ y^mm^r 

[0 0 7 7] gJ9 AlHf**^:/ - :7;i— 7 /M« 

s£^e— \t<otLito<Dify7 4 ^/ ^ • n.— if • ># — y=. 

^-7x-X 4 3 li, ^7 ■ ^ y 7 <DMcV)<n1& 

^4 3<h, •^^^4*3<tt^=¥— K3K«fcoT (JS^ 30 

[0 0 7 8] 19 AmX^£tl&£?{C^ 7yy^y7 
• — if * <^ 1/ 7 — 7x — 7 A 3J3\ ^^infe^fe*!* 
0 £TN£>ISI"C\ *fti*7*—' ^y h-f-^t K^ra^ > 

y hSrffi3£"C^S, 7^-^K4 4^J;(;4 6/)^4 9 
^•e^tfo y A — ^ K 4 4 frhba£>'& k , ^ — ifte, 
fy^-f • y*— -?y7lz.£ V fam-t^^ ^hl^<o^<—-J 
<£>URLT Y\^7 (e.g. , http://www.cis.canon.eom/tis/t 
is home. htm) &At}i~Z> 0 — if ^URLT K U7 & At) 40 
i-S(cii^<o^<7>*ife^*)5o ^--iffi. (1) 7K 

(2) • ^7 ^ift-URLT K^^= It, 

(3) URLT K^*?x^ - y^if/^b, ?y7 4 
y7 . rr — if . 4^7 — 7^ — ^4 3^, ^fcfi^^^ 

(4) ^l/yhUE^y5 4«:^y^^t5o 

[0 0 7 9] hURL*^>5 4l:ILT, ^--if 

hURL^^>5 4£r^ y s/^fSi:, $=-7- so 



"C\ • ^7 ^ift-. li.ffico ^^^(D^ — i/<DT K 

x^-^7 Kl/^^^xy • 7 y 7 (C^t L 

7K^-7-f — /U K 4 4 KB < o URL^K ^ > (Current 
URL) 5 4^fifb£n, ^C0^znyy-7T>-+f^gy^>t 

[0 0 8 0] l^Hltc^-rj: o [c, - ^7ni/'^^ 
^5 6t£. ^r-^Vir/V(Cancel)^^>5 7 4oJ:t>*^7^ 
if • 7 $ — h (Launch Browser) # :^ V 5 9 Sr^tf 0 
yt;^'^y5 7 it, URLT ^Current URL^ ^ 

V 5 4 £^ LTURLT H>^7^ — ;V K 4 4 ^A^/^-S 
<t 5 \^k 5 if wS*^^r-¥ fc^-trfe 
S 0 :^lty7 ^if • h • t£7 y5 9tt, 

• 77 >>if «r$tr3f^w«*(- J: «9 ifcot*«itf) 

[0081] ->x7' - yur— ^ y 7 (D^mmmmx 

yy^i j\s%> ^rURLT K U7 • y ^ —/u K 4 4 A^J 

itt)j:v\ tzk ^^tibm^m^mmnmn, ^ 

y^-— y b\cy*— ^y b Lfci/^(c(i. ^x— if(i7 
r^/^*4rURLT K^7 -r — yu K4 4 tcA^-T^o ^ 

^^^^mkmm<Djj^x^(Dyr^ ju^^mi^x, 

(C^7^- — -vy b-t& 0 

[0 0 8 2] ^77^7^' ^-—if • 4 y^y-7x-7 
4 3(cf^^M-r<Jr, ^ h/u- 7 ^— /u K (Title) 4 6 
11, ji— if ^7^-— 7-^ H^^xfc K^r^y > h<D/i#) 

f@Affl^-r k^^a^jt^s «t pK-rSo ^^^-r 
f+ttsr t h^mxhz> 0 7t^^f>fy^w7^ 

— ^K(Styles. Columns, Spacing) 4 7 4 9 fij, ^ 

hco^^-— ^/ h&fe^irZ> 0 \&<D7 y^r A 

my ^—/I'b'&miR-t&^tZ, ^^d-;w<-5 5^J; 

5*. "tfri^tKDy*— yy~>< ^py^—tv KiSRffl 
(D^^n-;w^$:^ y C^(cJ:<9r^ir^$ 

[0 0 8 3] *^-OKStyles)7^— ;UK4 7 fiffl^ K 

^^^t"5c wtbb<7>^^-f /Hi, 'h^ry^-r >-^(hea 
ders), ^ a (margins) 3£<7)A# S^*D#, ttl* K^r=.p« 
(i, ^>^r>tf"7 y — (contemporary) , 7^- — ^/l/(For
mal), 7r >{Fun)io£X$Zfv 7^ -y > 3 (Profess i 

[0 0 8 4] (Column) 37 ^ — yu K4 8 fc7* — * 

2o<7)n7 J t,(7)^>3y^ljfflt^, gp-^^i— ^ ^ A 
(Single)^o e tU ; ^fi^7^(Multiple)-Cfc5 0 L^L/£ 10 
^^^^(ir^2fStD^^->3 ^tdP£££;ft,&^ 0 * 

- 3^ AGDT^td^ — ?y h~tZ>o Zthi^ML. 
[0 0 8 5 ] 7-^ — i/'l/if (Spacing) ^7 — K4 9 I* 

hsn^ub^j K^a.^ >• b<Dffw\&m»ir 20 
L^L»t^> 3 ^t^iiii^ id 

(Condensed), / —?JV (Normal) , *5.fct*R^g^WllH 
(Easy-to-read)X^oT. ^(£> 9 *>JE«i(±ft/J^OtTW, 

[0086] #77 4 y9 • — if • ^fy^7x-^ 
4 3tttt^Ufa- (Preview) tK^V 6 0 

^--iffit^— ^ r 6 1 -e, 30 

[0 0 8 7] ^9 B0-e^$tt5J: 3 Id, £ 

• 7*—vy?&$LlZ*7+i/3 V (Options) ^ V 
6 1 £-afr 0 ^~^>3 >- • /^^>6 1 -7^-— h 

• 7*—^y?{c£*)&m£thX7* — -^y h Ztltz K 

^^y5/^«;^i:j;oT, r^f^yct^rt^ 40 

[0 0 8 8] ^9 B0t?^*ttS«t ?(d N ;^;A>3 V • 
^7o^-#y^762ll —« (General) ^"^v- a 
>64, (Container) ^-^v- 3 V6 6, ioj;!* 

* h y • * ?mWi (Strip Meta Info^z/v-s > 6 

ext Only) J i:^ j h^y^^ 7 2, >?p*| y 

>-?<D4 >"ry?7 (Index of links in the page) J £ 
1^5 y * h^ry^* 7 3 , 3oJ;U\ rj*ibli4SU(No fl so 
oating pictures)] <tV^ D ^ h^"^^^ 7 4 &"&tfo
m^Wt^gyil #^coy .X btfy ? 7[Z$o\,^X 

I"— J&J ^-^v-3 >6 4<D3rzfl/a ><D 2f@JL^±^fRlNp 

[0 0 8 9] r^** h<D^(Tcxt Only)j y ^ h ^> 

i/^T, ^d tdfc*? 1 ** hfc*ft*HiB«-rs J: 5 fd, ^ 

y 2 7 (Index of links inthe page) J y^ h ^ y ^ 
^7 3(1 $^7* - 7* — ^y^CMLt, !>x^ 

V hcO*mt^Px.$-^5o URL^ 

w^i^etii^M^T^^n^o r^ibijM l (No 

floating pictures)] }) * htf y # X 7 4 te* ^=^7- 

[0 0 9 0] 7 h y j/^- ^^ft#(Strip Meta Info) 
^>aV6 7 ■ ^ ^^tdJ: D^lS^ 

3 

(1) rtet(None)j 3i^(73^— ^J&^fpJfcl^^L 

(2) l ODTk^p/U— /U^-e(Til1 the first horizo 
ntal rule)j : ^feSft$ttfc» 1 *5 J:t>^ 2 ^tK^CO 

(3) l (D"r*^ h^T(Till the first tex 

t)j : ^7^7<D^<— isX*$kW<Ote£XfiMc%k<D-7 L *x b& 

[009 1 ] 7hy y /.^ ^if ^(Strip Meta Info) 
*7*\s 3 v 6 7 (i— Sid i ofc*ttS«-t-S d t ;&s-c# 

t^y 3 >fd|»JK-r td^Srf+i-r t id «t o 
TfeTj^^tL^o (container) t7"'> 3 >66 

ii, B9BBild^Six"C^5J:5ld x K^a^yh^ 

6 6^RWJditS:*,, 3yft7 6^K«S*i5o 
[0 0 9 2] 3yft7 6(iM^tlf:K^^^>h^) 

K^a>> b(DT KU^(i^>V L ^7 eidii^P^^-So 
mu^^^-r 7-6 URLcOx-^^
5^7* . 7*— *y*\c£ V&mZti&MFF&MM't 

MCtt&h, ttKDT^ ^>itm^-7 7 [Z&V)7F£inZ> 

[0 0 9 3] ^x — if^^^x"^ 7 60T^3^^ y s, 

7 7i*5{@GD;*:7 P >'3 >\ gpt>, r^<(0pen)J 79, 
r^t^i-5(empty)J 8 0, > hlT5 (Print) J 8 

1. rnHi-S(Edit)j 8 riSlg-r5(Save)j 10 

[0 0 9 4] rBB<(0pen)J 7 9 teT f >f ^ Id $ H5 
^9 BUtD^Vy 1 ^- ■ ^Vt 1 ^ b (Container Conte 

nts)^.^ y — >8 7 ^ij^tz^^-r^o =* >^r-r • =* >- 

f>F (Container Contents) ^ ^ !J-y8 7(t 4 M<D 
1P*>, ^ffiOURL^:^>^^7 6(^D^.S TURL 
il^JD (Add Current URL) J /f^V8 8^ — if ^= V 

r^y— h (Delete) J ^^89^ ^— if ^ ^ 

o, if^n^^-^-n^^^ - ^^y-y8 

7 SrK^ri^-CtSJ; 5 (Done) tf? V 9 

^^-7 (empty) J 80^^^^f5 

[0 0 9 5] Zh^^l/^-ri 6 tC^^^URLt^Jil 

SftSURLI^dr*: K7 s^L Ko ^/t^d £ so 

3££>W»tc:J; V , — ifii^^^-^7 6tC^5URLcOM 
SJSffSr-ttt-en^H-TSr t^-C#5o ^y>h(Pri 
nt)7j?^>8 1 , iHft(Edit)^^^8 2 do J: t/i? — y (Sa 
ve)3tf*>-8 4!*, T^^^y^n^B^, 3yft7 6 
S URLlc J: <9 ^» $ £ 9 ^ 7*<D^— i?X*± 

it, ^fbO^— ^^^--ifoo^^ilD (-^^— -^;y h 

TF^r-r^^iiM^it, ^ttt-^ts^ iBft-rs 

[0 0 9 6] t/y a> - ^7P^-^^^^62t: 

^^8180 (Print table of contents) J !JX h \y $ 7. 
9 2t rfeSai-^(wi-5(Empty after processing)] 
h7)f^^^9 4£<&£j> 0 WtW<D£o\Z^ V7btfy 



£1^5 w tSr^-r^^Jcffi^tiSo ^<£>,£T\ 21BW± 
<D})7 hjJ?s/^^^-Si:i«snSwt^-e^5 0 

Tf^^^^ — yyUSlJffilJ (Print table of contents) J y 
^ htf>^*9 2^SR*n5i:, nvx-^7 6C:fc5 
±"C<E>URLO>**f h^*7t- h Sixfcttl* K^a 
^ vhrortSy;* bt LTHiB'Ji-SJ: 
^— ry^C^f^c r«L3l«^ffi(C-f5 (Empty aft 
er processing) J y^ h # 3/ ^ >^ 9 4 tfiT # *r 4 ~7 \Zt£ 

^9t5co^/:f ^ t>, ^mm^^>^^ 7 6 

[0 0 9 7 ] 3yrW^3y6 2ro-M^U, 
rRTF^V (Select RTF Editor)j 7$#>6 

9, =3r -V>-ir7U (Cancel) aK^^ 7 0, *5 tM*— ^r — (0 
K)^^>7Htf:^ttf)ixt^6 e ' Trtf^^V^co 
51^ (Select RTF Editor)] ^ ^ > 6 9 JJ s/^i"5 
r<tl:J:ot, 3.— iffiRTF7r-<^ • 31 -r^ ^ (-^^> 

/ich:x.[i\ RTF^^V^ (^FIU^) ^r-^^^t" 

t^tcortf^^V ? <o 0 %<d im&m#l~tz> :^:J;o 

T, /^tl5o ^r-v^ir/UT^^^ 7 Ofi^V^^^-yv- 

[ 0 0 9 8] ^9 Am^^ZtlXl^&tiK y^y-iyt 
•a-f .>f^-7i-^4 3(t ^fc rHlJSlJ(Prin 
t)j 7^^y96, r^I^(Edit)j T-T^>9 7, Tir 
— y (Save) J 7^3^99, r^/U^(Help)j 7$<? > 1 
0 0, r#|.T(Done)j ^>101, *5j:t>* rft/Mt(M 
inimizing)j 7^=iyi0 2^?> o iffl^">^^* 

[0 0 9 9] n^ffilJ(Print)j T-T = V 9 6 

7 P yyh'^7o^-^>^^ (^FUl^) ^r^<o 
T^Ift(Edit)J T^^>-9 7{i, 7^— bStifci? 

rir — ^ (Save) J T-f=i>9 9(i, ^— ^7 ^~ ^ 
b&ti1Z$^7<D^*-i?\c&Wl&tftts RTF 7r^^i: 

(^Fia^) ^r^< 0 r^u^(Hei P )j ^yioo 

* ^-fe — ^SrSttL, -tLT f^T(Done)J $7^10 
-7 y $ & y biTZ> 0 r*/h 

fk(Minimizing)j T>f a V 1 0 2 fi, ttTt^P>i:»t 
<n&W£tlZ>±.&<Dy'^7* • 7^-^y^^ft/Wt^- 
[0100] ^9Ciii, ^^-f • y*—- vy*<ovm

tlh(D/ — fl, ^T^^(File)^-^.— 1 0 3, ^ 
T-V y h(Edit)^=a.— l 0 4, ^o^t^^-T V K£(Win 
dow)^^^-— 1 0 6&^tf 0 77-^/M^a-l03O 
ir-y(Save)T^ =3>-9 9, HH (Edit) T>T = 
V9 7*5<tU«HiaiJ(Print)T>r = ^9 6 COttlfigli*^ 

*i— SI" Ir-:/, ISM. &&Xf$\m<D*yr*s 

a y^IiW6 0 4if*/y h(Exit)^^v-a >fc£fc:7 10 

ami-^T^^(File)^ =.3.— 10 311 1" HTML 7 T ^ 
£BB< (Open HTML File) J t^'>3 V10 7 SrSBW 
5 0 ^tf^^v-s >H, P-^J/1/HTML77^/K IP 
s> h * ^— IT* (NetScape) a* <b -tr — y^t77-<^ 

f>f777'f^S:H< £\ *fcliURL£r9-f >- K9X<Z) 

5o rHTML7^yU^M< (Open HTML file) J t/Va V 20 
1 0 7(i*fc, ftHco^^^^z-r^^Mp R p(-J: 19 #M 
^ti^r^/L^, -)x7 l -7t-7^^^Lt, 
RTF^x-f/^ LT7t-7y h joi^WJ-T^Z 
ir - ^ J: XfMM , tfcfi^iftb^tsr^^t 

[0101] (Edit) J 10411 TUR 

LO^— h (Paste URL) J fr^is 3^109 
3 C ^CO rURLc^-<— ^ h (Paste URL) J Xy+x/a X10 
9(1, $nL7<D*<—\?frt>^ ^ix/cURLT 30 
C0-<— J* h/<y77<Ortt^ ±fSURL^ K4 4 

BB-rSitftSr^-— ifCfiitS r^u^(Help Topic 
s)J ^v's • *y9<T>WM%&& 

OPT (About WebFormatter)J d~7is a > k 

£ e ^> (Window) — 1 0 6(1**: r/^7 

T (Preferences) J ^"^^a^l 1 0 ^^tfo - 40 

<7> r^U77 (Preferences) J 11011, |9D 

1 2£Hf]<o 

[0 10 2] /1/77^^'WTn^-^^^l 

I2fl, ^^rf - y*— 7^^S:*fltftLfc5^liS« 
u^r L">* - ^7p^-^^^7;i i2d, r* = 

■e^X • tr^L — (Minimize View) J jr^z/a >1 1 3, 

r— ^ (General) J >1 14, *3«fct/ l"{£fflWWW 

X^£if(WWW Browser to use)J 3 V 1 1 5 50 



tf 0 T^^-v^X- t*^. — (Minimize View)j ^-^*>3 
VI 1311, • y*—-?y#<Difyy<<( • =3- 

tl^, »1C0»I1 rTOlJ(p rint )j , rjlH (Ed it) J 
iolT^ rir-^(Save)J t £^tr 0 - tUbiD^iy a > 
fl, Blgj^^y V h (Print) T-f =">9 6, 
y h(Edtt)T>1' = >'9 7, *3 J: t>'ir — ^ (Save) T <4 =1 V 

Zf*sa><DtLlto(DT'< it k V V V T 4 =a 

v v zsL*f4 y hT4 3 >\ *5<£U\ ^fcflir — 77^^ 

^COft^Jl 1 6 £^i" 0 
[0 10 3] ^9DIE]t^m5^:, r^-7>fX • t^L — 
(Minimize View) J ^^v-s^l 1 3J1, &tz, ^ff{Ko 
w)j t ^/^ (Stack)j (Dt/v/gyt)^ 

-^ly^^^T, rff(Row)J ^il^LTTK^Ft-. 

fi y ^ (stack) j ^awursaics^i-s X 0 
t:Kt6t^9^7 7^ 3/^ • 3.— if - -r ^z — y^. 

niBttca^-rs^^^-c^^ Mteo^j^Lr, ^77 

^ y? • — if ■ >f ^zn — ^ 1 1 6 (1, — ^IJOT 

[0104] rffi^VIWW^^ ^ if (WWW Browser to us 
e)J >-l 1 5{1, * 7^-— 

^^Oct 0 tC, Netscape, InternetExplorerio <t TJ^Mosaic 

ti^^^if ■ Jr-?*/*^k ^x&m^mmzftxi^ 

y ^r^if— . ^-y-> 3 VflNetscape Navigator ~C&> 
^0 

[0 10 5] r — (General) J VI 14 11, 

^if ^#fct^^^— h (Auto-startwithbrowser) J 
^->^^>3 yil7, rSr/J^"C^*— ^>-(0pen in minim 
ized view)J Tk-fi/alsl 1 8, ^£X_k(Dft\M 

(OfiiJt^ig^ (warn before printing more than page 

s)j ^"^v-aVl 19, &&Tf* r_MBs^_h^TOJO 
MtC^^ (Warn before saving more than MBs) j 
v-sVl 2 0k%^ts o t^yWtmcx?— h(Aut 
o-startwi thbrowser) J ^3^1 1 7 11, • 
-fyV-Fffi. Ty-y^^ycokZ^^ V^-y*— 
5o rg/J^-C^-— ^>(0pen in minimized view) J $T

^> 3 i s&mvi&tizt* ft/Hb^e— K-e^a:^ 

am before printing more than pages) J j ^"7° is 

3 VI 1 9:fcJ;t>* rMBsWi^PPJBflOHfriC*^ (Warn be 
fore saving more than MBs)j t/'Va VI 2 0 II, 
y*—?y h £tl1t K*~* V h£>-fe — :/£;fx£^— V s 

^^ey^^- 7oo4£ x 3-— »f^»Jl9-e#s«t 5 

5 e lii^^i^tyya XDtzHbtD-ry^rJU 

eneraO^v-a VcD 2 ffl£JLhaS|*)B$(Cji« £ th& Z\ £ 
^T*# 5<Z3|iW9 £-et>*l/\> 20 
[0106] ^1/77 I/y^ • ^7o^ • /tf;y^7 1 
I2f4*t ^-if^il^Lfc^u^y L^v^*Bt!9?H 
i"^rir Vir/b (Cancel) ^V 1 2 l*5j;t^ if^il 
^Lf:/i/77 ^VJXfcflKBt-S;*-— (0K)#* V l 
2 2 ±<Dmn<D£ 9x7-7^- ^ ;y ^ 

^l/77^'^7P^-^^^^112S:i 

fc££^:/ ■ #<r>t-\h<r># ? y 4 y$ - 

•f • 4 >?^y l 1 6<D—i¥\\&-r--t 0 zcoyyy 

4 y $ . if . V;? — — 7 1 1 6 !f tf* 
^^i^SrfliggLTt^Wtc:. l*»(floatingK ^ $ ~y 
^ — 7 ^ LTt/T^^, t£oT^^ if^^/O^-- 

^>^-7x-7ii6^^:^i:/^ 0 ^770 

^ - a-f .^y^-7x-^l 1 6 (^77>f^' 

tf) oiSMr^nySr^y^^tsriiciot, ^ 

— tmsffi^9^<0^— bx.T\ ML, 
-tU-C-t^RTF^r^^Srir— ^L, £11**5 J; tf£ fell 

[0 10 7] ^ — if{3\ M%t£i>$X7$?>$:jf7frjr 

7t- ^^^^rTO^j-r^r «h^-c#5 0 ~(omn^<k so 



?K /U77U^'^7P^-.f7^^n 2(C— 
St6J:3^7 p U77 l^V7 • #4T*# - 

• 7*- ^y^co«diSr*aa!9twaM-rsr t^r*# 

5c t>L^-— tf;^ ft/Wt ; e-K^e)7/u««*-Kl: 
A*Lfc^»^(-«, 3-— »9 El2tf>»*{fc(Max 
imizing)T-i =i V 1 1 7 £^ y y ^ -f S^S^fc 5 f£ ft 

[0 10 8] mi OEIfi*^ • 7;*— ^ ^0)^28 £ 

a^t57D-f+-hf*>5o r>^y-^^— 

(i^ry^s 10 0 otei^tiS. JtiEOJ: & 

7])y?1-&Zb\z£<>X?f?l\btfX$Z> 0 $^7- 
y*—*y91M£<0& o {^m^.it^titz<Df)\ IP^7;U 

mm*- Ktfc^^\ * fcfctmtf*/MtrE— k-cs> 

—7 4 3(cS(ao^7 7>r ^ • if • ^>^"7x 

— 7tK £^(3^7^ ;y ^ • if • ^ V^ — 7:xi — 

711 6 idJSffi ^7 :7>r ^ • 3.— if . ^fy^-7x 

— 7^cD }£%> bfrffi*^ yzfS 1 0 0 0-C*^$ttS o 
£JLTT?[*, ■ 7^-— -ry#<D-7 t yx— A- K^e— K 
li7/UiS6^- Kt?&S^:i6t-, #yy<<y9-=L—*? 

— if • ^>^-7x-7^^fy7°S 10 0 Ot'S/T 

[0109] ^i:^f77 p siooir% -yzsL-y.y* 

7 112, jait^^^v-g y-^7P^-#^76 

7t-77^Co^tO±EfM^llf:< 

2T% K#3.7 V h • y*— ^ y V • -T— ^(i. ±IE7 
>f-^K44t 4 6- 4 9l:A*Sn6o J: 9 ftffcWlw 
#9^, ^l— URL (&ft\±y T 4 frZi) ^URL7^ 
^K4 4^AM5o WTt-R^SixS^, £^;7 

^HfJia«J**bWC7>f— 6~4 9{CA^^tifc 

[oiio] ^r?'7'sioo3t' > tfj^(Dr>^^ * y 

xy- 11-^77^/5 1 0 0 4^ !7— /U K • 17 >f 
K - [>x7^^y h7-^[lggf^) 0 SlI7r^ 
S 1 0 0 5T% URL^fcfi^r^/^^A^^tl^^if 

* <o&&temt&j&mx*iL wLtztf %A£i-tz> i\thx 

#5o LfrLtetfh. y-^y* - y*— iy 9 <om<OMM 
BMl*yT^^%<D^>hV~$:'Zjm^1-Z><DX\ URL 

r K \^7{zh^>yr4 ^h\%ft\<Dy r 4 ^**&m.-tZ>-)5 
[0111] URL^>W— ;u K4 4 WSEKA* Sti/Cl*
5»-&f±, ftySli^x^l/S 1 0 0 6^itti\, y'ry? 
S 1 0 0 6T-ii^^y ■ y— URLT Kl^iCj;^ 

5 0 Hof)x^ - 37;*— ^y$X% £^>^£>URLT K 
u-;*S:S«U f ©7 K ^ =i > f t 7 6 leftttt 

ti. n yf t 7 6 07 K i/^lcfttt SixSf-^ Sr, ^ 

<£>^(textonly)j t/v^a > 7 2 ^fft>ttTl & 
^vu- Y£thtz.7 £ >-9frt>T** Vtmfim&Zth 20 
5 0 ^Ifi^/^b^^zy^S 10 1 l^iltPo 

[0112] r^fi^&oT, ^T^S1005 
T\ HTMLV — *y T ^ ^<Otz.tf)(Dy r 4 ^%>ti*X?)£th 

T-f/i-tD^ l co-^-f h(rr^ir^-r^) J: 9t-^h^-r^ 0 

^f^/sioo8ioj:t;sioo9t% -t^iM 
«»f$tLr, x-^^tttu^tv, ii^o^^^^s i o 
o 7 twicjjfeX'&mztx&o Wb^fy/SlO 

^HTMLV — ^7 r^MZ. V * V $*L5T1^£a»i? 5 30 
StSt^o *<Dyr'fMz£t>izg><<Dy->{hfryxb 
£tiXh^Z>m&^fc, 7n-|j:7r^S 1 0 0 8f:I 

v^t^fcfi, ^LSfi^s^S 10 1 lt-iitfo 

[01 13] ^75//S101 If, r>x7~. ^ 

— K4 8^S^fi(multiple)t^RS^tlTV^4i'&fix 

v hco^fd^^-— ^.y N£ti&r bfoftZo 40 
ft, V h(Print)T>f = V9 6, J^-f s> h (Edi t) 

7^3V97, ^fctiir — ^ '(Save) T>f 9 9 (Do ^ 

[0 1 14] 3X^^-1*3 tCURL35S»*ftSnTl^ K^ra 



-^y h*jJ:tf»filcl:fflot7*-7 3' hSix, y"ry 
ys 1 0 1 1 T*RTF7r-f/H-SE«IStLSt, ^^RTF7 
T-Y/M^T^.y^S 10 1 2T'tB^)^^5o BUCO^^X 
I3\ ^77^ y * • 3- — if • -y — ^ot°cor 
-Y ^^^eStj^^tTl^fcTiMCfifeoT. ^(DRTF^r-f^ 

[0115] *&wtem^\c£&&fc<Dmmmmizm^ 

t7p yr4 twy-^-ftv 

[Defaults]
Count=4
Title=MyDailyPaper

[1]
Heading=NewsInBrief
Site=1
Section=FrontPage
MaxLevels=5
MaxPages=10
MaxKBytes=2000
Date=today
Print=level0
Template=1

[2]
Heading=SportsInBrief
Site=2
Section=Sports
MaxLevels=0
MaxPages=10
MaxKBytes=200
KeywordFilter="Football"AND"49ers"
Date=today
Print=level0
Template=1

[3]
Heading=MoneyMatters
Site=1
Section=Business
MaxLevels=1
MaxPages=100
MaxKBytes=20000
KeywordFilter="Computer"OR"hardware"OR"Software"
Date=today
Print=all
Template=2

[4]
Heading=SriLanka
Site=3
Section=HotNews
MaxLevels=1
MaxPages=100
MaxKBytes=20000
Date=today
Print=leaves
Template=2
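
The personal news profile listed above uses a plain section/key layout, so it can be read with an ordinary INI parser. The following is a minimal sketch in Python, not the patent's own implementation: the function name load_entries, the dictionary keys, and the shortened two-entry profile text (with Count adjusted to 2) are assumptions made for illustration.

```python
import configparser

# Shortened copy of the personal news profile listed above (two entries only).
PROFILE_TEXT = """
[Defaults]
Count=2
Title=MyDailyPaper

[1]
Heading=NewsInBrief
Site=1
Section=FrontPage
MaxLevels=5
MaxPages=10
MaxKBytes=2000
Date=today
Print=level0
Template=1

[2]
Heading=SportsInBrief
Site=2
Section=Sports
MaxLevels=0
MaxPages=10
MaxKBytes=200
KeywordFilter="Football"AND"49ers"
Date=today
Print=level0
Template=1
"""


def load_entries(text):
    """Return one dict per numbered section of a personal news profile."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the mixed-case key names as written
    parser.read_string(text)
    entries = []
    for name in parser.sections():
        if name == "Defaults":
            continue  # global settings, not a newspaper section
        sec = parser[name]
        entries.append({
            "id": int(name),
            "heading": sec["Heading"],
            "site": sec.getint("Site"),
            "section": sec["Section"],
            "max_levels": sec.getint("MaxLevels"),
            # Optional key: None when the entry has no keyword filter.
            "keyword_filter": sec.get("KeywordFilter"),
        })
    return entries


entries = load_entries(PROFILE_TEXT)
for entry in entries:
    print(entry["id"], entry["heading"], entry["max_levels"], entry["keyword_filter"])
```

Interpolation is switched off because site profiles use percent placeholders such as %S, which the default parser would otherwise try to expand.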

# Legend:
# %W - day of the week
# %S - section part of URL

[Defaults]
Count=3

[1]
Title=SanJoseMercuryNews
Username=mwickram
Password=cannon
StartData=StartHeadlines
EndData=EndHeadlines
HomePage=http://www.sjmercury.com/
SectionURL=http://www.sjmercury.com/%S.htm
SectionCount=9
Section1=FrontPage
Section2=International
Section3=National
Section4=Local&State
Section5=Editorials&Commentary
Section6=Business
Section7=Sports
Section8=Living
Section9=Entertainment

[1,Sections]
FrontPage=front
International=intl
National=natl
Local&State=loc
Editorials&Commentary=edit
Business=biz
Sports=spts
Living=liv
Entertainment=ent

[2]
Title=TheSanFranciscoChronicle
HomePage=http://www.sfgate.com/chronicle/
SectionURL="http://www.sfgate.com/cgi-bin/chronicle/article-list.cgi?%S:/chronicle/today"
SectionCount=5
Section1=News
Section2=Business
Section3=Sports
Section4=Editorial
Section5=Datebook

[2,Sections]
News=News:MN
Business=Business:BU
Sports=sports:SP
Editorial=Editorial:ED
Datebook=Datebook:DD

[3]
Title=TheDayNews
HomePage=http://www.lanka.net/lakehouse/anclWeb/dailynew/
SectionURL="http://www.lanka.net/lakehouse/anclWeb/dailynew/%W/%S.html"
SectionCount=12
Section1=Business
Section2=Editorial
Section3=Features
Section4=Foreign
Section5=Letters
Section6=InBrief
Section7=HotNews
Section8=Probes
Section9=Military
Section10=Politics
Section11=Obituaries
Section12=Sports

[3,Sections]
Business=business/intro
Editorial=editorial/final
Features=features/intro
Foreign=foreign/intro
Letters=letters/final
InBrief=inbrief/intro
HotNews=hotnews/intro
Probes=probes/intro
Military=military/intro
Politics=politics/intro
Obituaries=obiturai/intro
Sports=sports/intro
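
The SectionURL templates in the site profile above combine a section code from the [n,Sections] table with the placeholders named in the legend. The sketch below shows one way such a template might be expanded. It is an illustration only: the function name expand_section_url is hypothetical, and the lowercase English day names substituted for %W are an assumption, since the listing does not state the exact form of the day string.

```python
import datetime

# Assumed spelling of the %W placeholder values; the profile does not specify it.
DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]


def expand_section_url(template, section_code, when):
    """Fill %W (day of the week) and %S (section part) in a SectionURL template."""
    day = DAY_NAMES[when.weekday()]
    return template.replace("%W", day).replace("%S", section_code)


# Section codes copied from the [1,Sections] table above.
SECTION_CODES = {"Business": "biz", "Sports": "spts"}

url = expand_section_url(
    "http://www.sjmercury.com/%S.htm",
    SECTION_CODES["Business"],
    datetime.date(2000, 1, 1),
)
print(url)
```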

Maximum levels to traverse: MaxLevels=<#>
-1 : traverse all levels
0-n : traverse up to level n

Maximum pages per document: MaxPages=<#>
n : document not exceeding n pages

Document size: MaxKBytes=<#>
n : document not exceeding n kilobytes

Date=today | lessthan<#>
today : retrieve only articles issued today
lessthan<#>; n : retrieve articles less than n days old

Retrieve=all | nosubdir | nothisdir | thissiteonly
all : also retrieve pages from other sites
nosubdir : exclude URLs in subdirectories
nothisdir : exclude URLs in this directory
thissiteonly : retrieve pages from this site only

KeywordFilter=<keyword> (AND | OR | NOT) <keyword> :

KeywordRank=<#>; n :
(applies to the keywords in KeywordFilter)

KeywordAuthor=<author> :

ExcludeType=ads | nonEnglish
ads : exclude advertisements
nonEnglish : exclude articles that are not in English

Print rule: Print=all | leaves | level=<#>
all : include all nodes of the tree in the extracted document
leaves : include all leaves of the tree in the extracted document
level=<#>; n : include level n of the tree in the extracted document

Format rule: Template=<#>
n : follow default or user template number n
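
The KeywordFilter syntax above joins quoted keywords with AND, OR, or NOT, as in the profile entries "Football"AND"49ers" and "Computer"OR"hardware"OR"Software". The following sketch shows one possible evaluation of such a filter against article text. The listing does not define operator precedence or case handling, so strict left-to-right evaluation, case-insensitive substring matching, and reading "a NOT b" as "a and not b" are all assumptions, and the function name matches is hypothetical.

```python
import re

# A token is either a quoted keyword or one of the three operators.
TOKEN = re.compile(r'"([^"]*)"|(AND|OR|NOT)')


def matches(filter_expr, text):
    """Evaluate a KeywordFilter expression left to right against text."""
    lowered = text.lower()
    result = None   # running truth value; None until the first keyword is seen
    operator = None
    for token in TOKEN.finditer(filter_expr):
        keyword, op = token.group(1), token.group(2)
        if op:
            operator = op
            continue
        hit = keyword.strip().lower() in lowered
        if result is None:
            result = hit
        elif operator == "AND":
            result = result and hit
        elif operator == "OR":
            result = result or hit
        elif operator == "NOT":
            result = result and not hit
    # An empty filter places no restriction on the article.
    return True if result is None else result


print(matches('"Football"AND"49ers"', "49ers win football opener"))
```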

ftB3A 40 
(fllAffl — 3. — ^ • :Xn yy^)V • xfV ^ ■ ^E-v^ — 

yU^COT^ir^^^cffllL, CProf ileMgr^ ^^tCjzoT 
BOOL NewProfile(CString fileName);

- fil enamel <fc D 4^. bnSSf jfi^n 77^ ^**J 
BOOL OpenProfile();
- Opens the default profile
BOOL OpenProfile(CString fileName);

CProfileEntry* GetFirstEntry();

CProfileEntry* GetNextEntry();

, ^CD^o ^T-f/W^^^ h y £p — KLTiS-f 

BOOL WriteEntry(CProfileEntry& entry);

#*£>yfo 37 7 W/U^V h y liCProfileEntry^ 7^1- 

(* : 

CURL GetSiteId();

- ^(Dzfay 7 ^/u - xyl> y tcgr^^lfW MD£rig 

■r 

CExtractionSpec GetExtractionSpec();

- ^(Dyuyr^/u . ^> h y ic^nsttiiitt^ig 
COutputSpec GetOutputSpec();

<r>zcy. y— ^ . ^o^— CWebPageCQ^ T^fi, 
h^^^if— row ^?y^ — *£^ftU 

BOOL Load();

- URL, if^., /^^-K^i^t^xX^-^/^ 
&oT< 5 

BOOL Parse();

CURLList* GetLinks();

- aK^i^— ^rtroy >^ • y * h^r^-r 

CPageData* GetData();
void FilterContent();

i-s 

CString GetTitle();

- ifW hco^-^ic^oT^W h/u&O^flferofllraSrili- 
CString GetAuthor();

int GetSize();

- ; eroT f — ^WX^^r p^^W hT^i" 
CNetwork^^^tiOLEBifiBSrrtQSi*:, W>^-^>^ N 
CString GetUsername();
void SetUsername(LPCTSTR);

- ^CNetwork^r/v^^ hfo\Z&&<D=.*—9 s £i:±<y 
CString GetPassword();
void SetPassword(LPCTSTR);

- ^(OCNetwork^^v^^ h f^fPI&tfV^ !7 — K &:1r 10 
void Close();

- ±X<DT^^y"^m^W v ), -^CNetwork^z/v^ 
short Read(BSTR* pBuffer, short iAmount);
long GetStatus();

BOOL Open(LPCTSTR pURL, short iMethod, LPCTSTR pPostData, long lPostDataSize, LPCTSTR pPostHeaders);

CString GetErrorMessage();

short GetServerStatus();

long GetContentType();

CString GetContentEncoding();

CString GetExpires();

- rcon- Kid J: D^^^^x^i^-^^S-P-W^iT-^^ 

CString Resolve(LPCTSTR pBase, LPCTSTR pRelative);
BOOL IsFinished();
short BytesReady();

- xf>f ^(Ciilt^ f3 , CSiteDriver^ 7 X 
BOOL NewProfile(CString fileName);

BOOL OpenProfile();

BOOL OpenProfile(CString fileName);

- ^%&W\<D7 p v7t4 ^%$-—Zf>-fZ> 50 



CSiteProfile* GetFirstSite();

- ^&^<D^ htD^^bV^^—h^xmri- 
BOOL WriteEntry(CSiteProfile& entry);
int NumberOfSites();

^4Wt^©(9«^, CSiteEntry^ 7*l£«fc 9 * 

CString GetURL();

- SRt^ hcDSSgURL^igi" 
CString GetUsername();

CString GetPassword();

CString GetTitle();

CString GetTitle();

int SectionCount();

nms B 

^-^y-hwy-T'fcs) ^t^t)tot% cp 

ageTreetC J: o X$k £tlZ> 0 CPageTreef*WWW2:«Wr b 

CPageTreeNode* GetRoot();

- Hate^y— (DA— hy- K£rilf 
BOOL Build(CURL URL, CExtractionSpec& spec);

^<~i?y !) —fa<Dm* <D / — K(iCPageTreeNodetd<to 
BOOL AddChild(CWebPage* page);
CWebPage* GetPage();
int NumberOfChildren();
BOOL IsLeaf();
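
The Build and AddChild declarations above assemble a page tree from a starting URL, and the claims later describe downloading with reference to a linked list so as to avoid hypermedia links that form loops and repeated downloads. The sketch below illustrates that idea in Python under stated assumptions: fetch_links is a hypothetical callable standing in for loading a page and returning its links, a set replaces the linked list named in the claims, nodes are plain dictionaries rather than CPageTreeNode objects, and MaxLevels=0 is taken to mean "the start page only".

```python
from collections import deque


def build_tree(start_url, fetch_links, max_levels=-1):
    """Breadth-first build of a page tree; max_levels=-1 means no depth limit."""
    visited = {start_url}  # URLs already placed in the tree
    root = {"url": start_url, "children": []}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_levels != -1 and depth >= max_levels:
            continue  # depth limit reached, do not follow links from this page
        for link in fetch_links(node["url"]):
            if link in visited:
                continue  # skips loops and pages that were already downloaded
            visited.add(link)
            child = {"url": link, "children": []}
            node["children"].append(child)
            queue.append((child, depth + 1))
    return root


# Tiny in-memory link graph used instead of real network access.
LINKS = {"/": ["/a", "/b"], "/a": ["/", "/c"], "/b": ["/a"], "/c": []}


def links_of(url):
    return LINKS.get(url, [])


tree = build_tree("/", links_of)
print([child["url"] for child in tree["children"]])
```

Because "/a" links back to "/" and "/b" links to "/a", the visited set is what keeps each page in the tree exactly once.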

i>nL:/<£>^— i/y y — S:«»ri-6i-li, CTreelterator 
void Reset();

CPageTreeNode* GetNextNode();

- KSriSi" 

CPageTreeNode* GetNextSibling();

CPageTreeNode* GetNextLeaf();
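
The iterator declarations above walk the page tree node by node or leaf by leaf, which is the basis for flattening the extracted data tree into a linear document. The following sketch shows a pre-order flattening combined with the Print rule values used in the profiles (all, leaves, level0). It is illustrative only: PageNode and flatten are hypothetical names, and reading "levelN" as "only nodes at depth N" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PageNode:
    title: str
    body: str = ""
    children: list = field(default_factory=list)


def flatten(root, rule="all"):
    """Pre-order walk returning (depth, title, body) for nodes kept by the rule."""
    linear = []

    def walk(node, depth):
        is_leaf = not node.children
        keep = (
            rule == "all"
            or (rule == "leaves" and is_leaf)
            or rule == f"level{depth}"  # assumed meaning: exactly this depth
        )
        if keep:
            linear.append((depth, node.title, node.body))
        for child in node.children:
            walk(child, depth + 1)

    walk(root, 0)
    return linear


front = PageNode("Front", children=[
    PageNode("A"),
    PageNode("B", children=[PageNode("B1")]),
])

for depth, title, _ in flatten(front):
    print("  " * depth + title)
```

Keeping the depth alongside each entry lets a later formatting step choose heading levels without revisiting the tree.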

^fflotrwy-SrSIBfU S» K ^ Mi, fcH 

tt5^yr-f ^7^77^!) ;* h^^co^^ — ^ 

s'hSrffl^T^f^HSo fcH* > Miy ry^ 20 

• h • 7*—?y h (RTF)"C*>9 . ^< (DT^V 

^-^ 3 y(:J;^7^t^pf^Tfc6 0 RTFlix^;* h 
td*fL"CitA/fc*7^— ^y h»fgT*fcl9, K^^yy 
h , ir^ v-3 v'tD^^— 73/ h^. T^Z-i 

o^Tfi-y-^— Mi&i\ -y— \ts<*-7 L <f<D : 7>(y7 y 

*tLTHTML77'f^4jSt5o *7^--7 3/^ 30 
teXFormatter^ Soffit* : 

BOOL OpenHTMLFile(CString fileName);

- tijMMM^tlf:HTML77^/^r^-7 P >'t 

void CloseHTMLFile();

- ^coHTML^r^r^^^ n— xur-tr— y-ra 

BOOL PrintHTML(CPageTree& root, COutputSpec& format);

- SK/u- h £W*tt«a*^;tfcn5 ir, ^<7>^y-$: 

40 

BOOL OpenRTFFile(CString fileName);
void CloseRTFFile();

- SKRTF7T>T^4r^ XLir— ~?-fZ> 
BOOL PrintRTF(CPageTree& root, COutputSpec& format);

BOOL Print(CPageTree& root, COutputSpec& format);



48 

[02] 01 TM^stisfflAffl^^— y^sst^*^ 

[i3 A] 2|s:3gWlCt*oTtiJ*0^»75Sffh)tLSIS<7? 
[0 3 B] 0 3 AI^Sn^^^iyd^tttBSnfc}* 

tHx-^ ■ ^y-co^^^-rm e 

[0 3 C] El 3 B 07 y - *^¥fiftMl$iife 
3. > > h <^«J*Sr^"f-0o 

[0 3 D] 0 3 C(D¥-mt V h^b7t-7 

s> h£*Lfc K^a^yh^m 
[04 ] ^^^SriilS-C^rL — ^|E*^r»*-rSfc«) 

[0 5 A] f^Affl-^- — * • 7 P P77^^ s ^ < t ? 
- ho 

[0 5 B] fiAffl-^ — ^ • ^vyyi fri£}£<D& b 

-ho 

[0 6] ^SHWKfiEoT, ^3.— ^E*^^^^b 
[07] =3.-^i5*^^<7?J;5^LT. fSAffl- ^- 

-7.^P77^^^MLT (7 p 077^^l:«*l 

~C) ^^^^Siai^nS^^SrK^-t-S^n— 5^^ — 

ho 

[08] = ^ — ^1E*35S^J: b IT, 

f@Affi^— ^ • yp77^/^ilLT7^— 7^ h 
/yyh«S>f>^-7x-^{^f,ttS^Sr 

[0 9 A] *5IM(75S 2 <©||J6ifgffii:#(ci£ffl £*x5 
^770^ - tf • ^ — :7:n — ^^JS< 0o 

[09 b] #mw<Dm2<Dmi&Bntma$imt$ftz> 
yyy^y? • 3.— y 2 • -f^-^-^SK 0 O 

[09 C] **WcoS2co^li6?Kffii:*^«fflSnS 
^77^;^ • ^.-if • y^-7x-7«< 0 O 

[0 9 D ] W cog! 2 wHtS^ffi <t £ 

tf'yy 4 • F • ^y^-7x^^jffi<i 0 

[09 E] *3SW^S2^J6?gffit#twffiffl$ix5 
7?y 4 yt • ^—^f • ^V^-^^-^^lf< 0 O 

[010] **W^»2^1i£?»ffiSrtt§Bi-5 7n- 
1. TITLE OF INVENTION

WORLD WIDE WEB NEWS RETRIEVAL SYSTEM

2. CLAIMS

1. A method for formatting data from at least one hypermedia document, comprising the steps of:

an accessing step to access the at least one hypermedia document;

a retrieving step to retrieve data from the hypermedia document into an extracted data tree, wherein the data is retrieved based on a structure of the hypermedia document;

a flattening step to flatten the extracted data tree into a linear document; and

a formatting step to format the linear document into a formatted document.

2. The method of Claim 1, further comprising the step of printing the formatted document.

3. The method of Claim 1, wherein said hypermedia document is located on the World Wide Web.

4. The method of Claim 1, wherein said hypermedia document is located on the Internet.

5. The method of Claim 1, wherein said hypermedia document is located on an intranet.

6. The method of Claim 1, wherein said accessing step, said retrieving step, said flattening step, and said formatting step are performed in accordance with a personal-news-profile.

7. A method of creating a personal-news-profile for retrieving data from a hypermedia-linked computer network, comprising the steps of:

accessing the hypermedia-linked computer network;

entering a learning mode;

traversing sites on the hypermedia-linked computer network with commands;

extracting at least one rule from the commands; and

compiling the at least one rule into the personal-news-profile.

8. The method of Claim 7, wherein the at least one rule specifies structural characteristics of sites for traversing the hypermedia-linked computer network.

9. The method of Claim 8, wherein the at least one rule also specifies content-based criteria for traversing the hypermedia-linked computer network.

10. A personalization system for creating a personalization profile for a Web site retrieval data retrieval system, the personalization system comprising:

an input device for inputting data and commands to access the World Wide Web;

a connection to the World Wide Web;

a memory for storing a Web reader, the Web reader for accessing the World Wide Web via the connection to the World Wide Web according to commands from the personalization system; and

a processor for launching the personalization system in response to a user command, wherein the personalization system, upon being launched, (1) launches the Web reader, (2) accesses the World Wide Web via the Web reader, (3) enters a learning mode, (4) sends commands to the Web reader to traverse the World Wide Web according to user commands, (5) extracts at least one rule from the user commands, (6) compiles the at least one rule into a personalization profile, and (7) stores the personalization profile.

11. A method for retrieving articles from a hypermedia-linked computer network and for formatting the articles into a personalized newspaper, the method comprising the steps of:

retrieving a stored personal-news-profile which comprises address data for a site on the hypermedia-linked computer network, command data for accessing data from the site, and newspaper layout commands;

contacting the site based on address data stored in the personal-news-profile;

downloading articles from the site based on command data stored in the personal-news-profile;

flattening the articles into a linear document; and

formatting the linear document into the personalized newspaper according to layout commands stored in the personal-news-profile.

12. The method of Claim 11, further comprising the step of printing the personalized newspaper.

13. The method of Claim 11, wherein said hypermedia-linked computer network is the World Wide Web.

14. The method of Claim 11, wherein said hypermedia-linked computer network is on the Internet.

15. The method of Claim 11, wherein said hypermedia-linked computer network is on an intranet.

16. The method of Claim 11, wherein the command data for accessing data includes data for selecting articles based on a structure of the site.

17. The method of Claim 16, wherein the command data for accessing data also includes data for selecting articles based on a content of the articles.

18. A World Wide Web site data retrieval system for accessing at least one Web site, for retrieving data from the Web site, and for formatting the data into a personalized document, the system comprising:

an input device for inputting data and commands to access the World Wide Web;

a memory for storing a Web site data retrieval driver which includes a Web reader, stored Web site address information, stored Web site commands, and stored format information, wherein the memory also includes process steps to connect to a Web site and to issue commands within the connected Web site;

a connection to the World Wide Web; and

a processor for launching the Web site data retrieval driver in response to a user inputting a command to access the World Wide Web, wherein the Web site retrieval driver, upon being launched, (1) launches the Web reader to connect to the World Wide Web via said connection, (2) retrieves the Web site address information and Web site commands, (3) instructs the Web reader to access the Web site based on the Web site address information and Web site commands, (4) downloads Web site data from the Web site based on the Web site commands, wherein the data is downloaded with reference to a linked list so as to avoid hypermedia-links that form loops and so as to avoid repetitious downloading of data that has already been downloaded, (5) stores the Web site data in a linear document, (6) repeats steps 1 through 5 until all addresses in the stored Web site address information have been accessed, and (7) formats the linear document into the personalized document based on the format information.

19. The Web site data retrieval system of Claim 18, wherein the Web site address information, the Web site commands, and the format information stored in the memory form a personalized-news-profile.

20. The Web site data retrieval system of Claim 18, further comprising a printer for printing the personalized document.

21. The Web site data retrieval system of Claim 18, wherein the personalized document represents a personalized newspaper.

22. The Web site data retrieval system of Claim 18, wherein the personalized document represents a personalized magazine.

23. The Web site data retrieval system of Claim 18, wherein the personalized document represents a personalized book.

24. Computer executable process steps stored on a computer-readable medium, said steps for accessing World Wide Web sites for retrieving data at the sites and for formatting the data into a personalized document, said steps comprising:

a connecting step to connect to the World Wide Web;

a retrieving step to retrieve user-defined Web site address information, user-defined Web site commands, and user-defined formatting commands;

an activating step to activate a Web reader so as to access a Web site based on the user-defined Web site address information and retrieving data from within the Web site based on the user-defined Web site commands;

a downloading step to download the retrieved Web site data from the accessed Web site into an extracted data tree;

a flattening step to flatten the extracted data tree into a linear document;

a step to repeat the downloading step and the flattening step until all addresses in the user-defined Web site address information have been accessed; and

a formatting step to format the stored data into the personalized document based on the user-defined formatting commands.
25. The computer executable process steps of Claim 24, further comprising a spooling step to spool the personalized document to an output device.

26. The computer executable process steps of Claim 25, wherein the output device is a printer.

27. The computer executable process steps of Claim 25, wherein the output device is a display.

28. The computer executable process steps of Claim 24, wherein the user-defined Web site commands include commands for selecting data based on a structure of the Web site.

29. The computer executable process steps of Claim 28, wherein the user-defined Web site commands also include commands for selecting data based on a content of the Web site.

30. An apparatus for retrieving news articles from on-line news services on the World Wide Web and formatting the news articles into a personalized newspaper, the apparatus comprising:

first storage means for storing (1) a personal-news-profile which comprises address data and command data for accessing data from a Web site, and (2) newspaper format commands;

retrieval means for retrieving the stored personal-news-profile and accessing data stored therein;

activating means for activating a Web reader to contact a Web site based on address data stored in the personal-news-profile;

downloading means for downloading news articles from the contacted Web site based on command data stored in the personal-news-profile;

second storage means for storing the downloaded news articles; and

formatting means for formatting the stored news articles into the personalized newspaper based on the newspaper format commands stored in the personal-news-profile.

31. The apparatus of Claim 30, further comprising spooling means for spooling the personalized newspaper to a printer.

32. A method for formatting data from a hypermedia document into a personalized document, comprising the steps of:

a location specifying step to specify a location of the hypermedia document;

a type specifying step to specify the type of the hypermedia document;

a scope specifying step to specify the scope of data to retrieve from the hypermedia document, wherein the scope is based on a structure of the hypermedia document;

a format specifying step to specify a format for formatting the data retrieved from the hypermedia document into the personalized document;

an accessing step to access the hypermedia document found at the location specified in the location specifying step;

a retrieving step to retrieve data from the hypermedia document accessed in the accessing step, wherein the data is retrieved in accordance with the type specified in the type specifying step and in accordance with the scope specified in the scope specifying step; and

a formatting step to format the data retrieved in the retrieving step into the personalized document, wherein the data is formatted in accordance with the format specified in the format specifying step.

33. The method of Claim 32, further comprising a printing step to print the personalized document.

34. The method of Claim 32, wherein the location specified in the location specifying step is a filename.

35. The method of Claim 32, wherein the location specified in the location specifying step is a uniform resource locator for the World Wide Web.

36. A method of processing a hypermedia document, comprising the steps of:

accessing the hypermedia document;

extracting addresses from the hypermedia document;

storing the addresses extracted from the hypermedia document in a container in a memory;

activating a processing function to process data stored at the addresses stored in the container;

downloading the data stored at the addresses in the container into the memory;

extracting predetermined data from downloaded data in accordance with predetermined configuration information;

formatting the predetermined data in accordance with predefined formatting settings to generate a formatted document; and

processing the formatted document in accordance with the processing function.

37. A method according to Claim 36, further comprising a step of previewing the formatted document prior to processing the formatted document.

38. A method according to Claim 37, further comprising the steps of:

changing the formatting settings after previewing the document and before processing the formatted document in accordance with the processing function;

re-activating the processing function; and

re-formatting the data in accordance with changed formatting settings to generate the formatted document.
39. A method according to Claim 36, wherein the addresses are stored in the container in the order that the addresses are input into the container; and

wherein the processing function processes the predetermined data in the order that the addresses are stored in the container.

40. A method according to Claim 39, further comprising the step of rearranging the addresses stored in the container by dragging and dropping the addresses within the container.

41. A method according to Claim 36, further comprising a step of inputting the formatting settings and configuration information via a graphical user interface.

42. A method according to Claim 41, wherein the graphical user interface comprises plural processing icons, one of which activates the processing function.

43. A method according to Claim 42, wherein the graphical user interface is displayed in plural modes.

44. A method according to Claim 43, wherein the plural modes comprise (1) a fully-functional mode in which the graphical user interface displays formatting fields, processing options, menus and the processing icons, and (2) a minimizing mode in which the graphical user interface displays only the processing icons.

45. A method according to Claim 44, wherein the graphical user interface displayed in the minimizing mode is displayed during browsing the hypermedia document.

46. An apparatus for processing a hypermedia document, comprising:

a Web reader which accesses the hypermedia document;

means for extracting addresses from the hypermedia document;

a memory including a container which stores the addresses extracted from the hypermedia document;

a graphical user interface having processing icons which activate at least one processing function to process data stored at the addresses stored in the container; and

processing means which (1) downloads the data stored at the addresses stored in the container into the memory, (2) extracts predetermined data from downloaded data in accordance with predefined configuration settings, (3) formats the predetermined data in accordance with predefined formatting settings to generate a formatted document, and (4) processes the formatted document in accordance with the processing function.

47. An apparatus according to Claim 46, further coraprisin 
g previewing means for previewing Lhe formatted document prior Lo proces 
sing the formatted document. 

48. An apparatus according to Claim 46, wherein the addre 
sses are stored in the container in the order that the addresses are inp 
ut into the container; and 

wherein the processing function processes the predetermine 
d data in the order that the addresses are stored in the container, 

43. An apparatus according to Claim 48, further coraprisin 
g dragging and dropping means for dragging and dropping the addresses li 
sted in the container in order to rearrange the addresses in the contain 
er. 

50. An apparatus according to Claim 46, further coraprisin 
g inputting means for inputting the formatting settings and configuratio 
n information ria a graphical user interface. 

51. An apparatus according to Claim 50, wherein the graph 
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ical user interface comprises plural processing icons, one of which acti 
vates to the processing function. 

52. An apparatus according to Claim 51* further c 
umjirioing display means for displaying the graphical user interface in p 
lural modes. 

53. An apparatus according to Claim 52, wherein the plural modes
comprise (1) a fully-functional mode in which the graphical user
interface displays formatting fields, processing options, menus and
the processing icons, and (2) a minimizing mode, in which the
graphical user interface displays only the processing icons.

54. An apparatus according to Claim 53, wherein the graphical user
interface displayed in the minimizing mode is displayed during
browsing the hypermedia document.

3. DETAILED DESCRIPTION OF INVENTION

Field Of The Invention 
The invention relates to a data retrieval system which automatically
traverses hypermedia documents on a computer network and automatically
retrieves information from those documents based on a match between
the structure of the documents and a personalized data retrieval
structure. More particularly, the invention can retrieve articles from
a news service, from a magazine service, or from a combination of both
services which are located on the World Wide Web, a private computer
network that supports hypermedia links, or any other hypermedia-linked
computer system.

For example, there exists a Web site for retrieving news articles from
the New York Times and a Web site for retrieving articles from People
magazine. The retrieval system of the invention can traverse through
such Web sites and select articles based on a personalized data
retrieval structure. The personalized data retrieval structure may
include commands to retrieve a full text of the front page only,
headlines of the business section, headlines of the stock section and
sports section, etc. In addition, the personalized data retrieval
structure may include content-based rules to retrieve articles with
certain keywords, to exclude articles with certain keywords, or to
include articles based on a rule-based content analysis. The invention
also provides a method for synthesizing all retrieved news articles
and printing the synthesized news articles into a newspaper-type
format in which each of the articles is arranged based on a user's
predefined layout.

While the above example is in the context of the Web, hypermedia
documents can reside on other types of networks besides the Web, such
as an intranet. An intranet is a private computer network that is not
connected to outside computer networks. For example, a company's own
computer network could be an intranet with hypermedia documents on it.
For brevity, the following discussion is made with respect to the
World Wide Web. However, it should be understood that the invention
applies equally well to any type of computer network that contains
hypermedia documents, such as an intranet, different hypermedia-linked
computer networks that reside on the Internet other than the Web, etc.

A hypermedia document on the Web can span multiple Web sites. Such
documents can be newspapers, news articles, magazines, catalogs,
manuals, memoranda, and the like. For brevity, the following
discussion is made with respect to sources of news information.
However, it should be understood that the invention applies equally
well to any other type of hypermedia document.

Description Of The Related Art 
The World Wide Web is an on-line source of hypermedia documents
containing hypermedia text and images that act as links to other
documents, Web sites, etc. As a result, documents on the Web are not
organized sequentially. Rather, a user is automatically linked to
other documents or Web sites to complete the viewing of a document by
selecting a hypermedia link, such as a text link or an image link,
within the document. Accordingly, an entire document cannot be viewed
by scrolling through text.

One popular use of the Web is on-line publication and distribution of
magazines and newspapers. Currently, many Web news services, such as
the New York Times, allow the user to define keywords of interest and
to receive news information, daily or hourly, that contains text
matching the keywords. The news information can then be delivered to
the user's computer via modem or E-mail. However, most Web news site
newspapers, like the New York Times, include too much information,
most of which has no interest to the user since the information is
retrieved based only on a keyword match.

Other sources of news information are provided through information
suppliers like "Individual Inc." Individual Inc. supplies users with a
brief summary of the top twenty most relevant articles based on a
user's predefined keywords. This subscription news service allows the
user to specify five to ten areas of interest based on keywords, which
are then prioritized by the user. The information service searches the
Web for magazines and newspapers which contain any of the keywords.
Based on the keyword searches, twenty of the most relevant articles
are selected, compiled into a brief one-page summary, and transmitted
to the user via facsimile for the user's review. However, in order to
review an entire document rather than the summary, the user must log
onto a specific Web site containing the document in order to retrieve
and review the document.

There are yet other services which permit the user to personalize a
newspaper to be displayed at the user's terminal by storing links to
various news articles from various news sources on the Web. For
example, CRAYON "Create Your Own Newspaper" permits a user to select
specific sections from among links to over twenty-five different
on-line newspapers, and to compose the selections into a personalized
newspaper. Using CRAYON, it is possible to compose a personalized
newspaper containing, for example, links to the international section
of the New York Times, the business section of the Wall Street
Journal, and the sports section of the Chicago Tribune. The HTML
(hypertext markup language) source file for this newspaper is then
stored to mass media storage for later use.

While the foregoing news and information services provide convenient
ways to keep updated on the news, they do not allow a user to access
and view the news in the way that people naturally read a real-world
newspaper. Namely, people naturally read a newspaper by scanning the
pages of sections that they find interesting and then reading those
articles that grab their attention. In other words, people use a
structural approach to decide what pages to look at initially (e.g.,
the first page of the business and World sections, and the comics page
of the Arts section). They then scan the selected pages for articles.

In sum, conventional news and information services do not allow a user
to access data from a hypermedia document on the basis of the
structure of the document, and then to format that data in a manner
that allows the user to scan and read the data in a natural fashion.

SUMMARY OF THE INVENTION

The invention addresses the above deficiencies in the art by accessing
at least one hypermedia document, retrieving data from the hypermedia
document into an extracted data tree, with the data retrieved based on
a structure of the hypermedia document, flattening the extracted data
tree into a linear document, and formatting the linear document into a
formatted document.

In another aspect, the invention creates a personal-news-profile for
retrieving data from a hypermedia-linked computer network. The
hypermedia-linked computer network is accessed, a learning mode is
started, the hypermedia-linked computer network is traversed with
commands, at least one rule is extracted from the commands, and the
rule(s) is compiled into the personal-news-profile.

In yet another aspect, the invention creates a personalization profile
for a Web site data retrieval system. Data and commands are input to
access the World Wide Web and a connection is made to the World Wide
Web. A Web reader is launched, and the Web reader accesses the Web via
the connection. In response to user commands, a learning mode is
entered. Commands are sent to traverse the World Wide Web, and at
least one rule is extracted from the commands. The rule(s) is compiled
into a personalization profile, which is stored.

In yet another aspect, the invention retrieves articles from a
hypermedia-linked computer network and formats the articles into a
personalized newspaper. A stored personal-news-profile is retrieved.
The personal-news-profile includes address data for a site on the
hypermedia-linked computer network, command data for accessing data
from the site, and newspaper layout commands. The site is accessed
based on address data stored in the personal-news-profile, and
articles at the site are downloaded based on command data stored in
the personal-news-profile. The downloaded articles are flattened into
a linear document, and the linear document is formatted into the
personalized newspaper according to newspaper layout commands stored
in the personal-news-profile.

In yet another aspect, the invention retrieves data from a World Wide
Web site and formats the data into a personalized document. A Web site
data retrieval driver which includes a Web reader, stored Web site
address information, stored Web site commands, and stored format
information is accessed. The invention (1) launches the Web reader to
connect to the World Wide Web via a connection to the Web, (2)
retrieves the Web site address information and Web site commands, (3)
instructs the Web reader to access the Web site based on the Web site
address information and Web site commands, (4) downloads Web site data
from the Web site based on the Web site commands, wherein the data is
downloaded with reference to a linked list so as to avoid
hypermedia-links that form loops and so as to avoid repetitious
downloading of data that has already been downloaded, (5) stores the
Web site data in a linear document, (6) repeats steps 2 through 5
until all addresses in the stored Web site address information have
been accessed, and (7) formats the linear document into the
personalized document based on the format information.

In yet another aspect, the invention accesses and retrieves data at
World Wide Web sites and formats the data into a personalized
document. The invention connects to the World Wide Web, retrieves user
defined Web site address information, user defined Web site commands,
and user defined formatting commands, and activates a Web reader so as
to access a Web site based on the user defined Web site address
information. The Web reader is used to download data from the Web
based on the user defined Web site commands, and the data is
downloaded into an extracted data tree. The downloading continues
until all addresses in the user defined Web site address information
have been accessed. The extracted data tree is flattened into a linear
document, and the flattened document is formatted into the
personalized document based on the user defined formatting commands.

In yet another aspect, the invention retrieves news articles from
on-line news services on the World Wide Web and formats the news
articles into a personalized newspaper. The invention stores a
personal-news-profile which comprises address data and command data
for accessing data from a Web site and newspaper format commands,
retrieves the stored personal-news-profile and accesses the data
stored therein, activates a Web reader to contact a Web site based on
address data stored in the personal-news-profile, downloads news
articles at the contacted Web site based on command data stored in the
personal-news-profile, stores the downloaded news articles, and
formats the stored news articles into the personalized newspaper based
on the newspaper format commands stored in the personal-news-profile.

In yet another aspect, the invention formats a hypermedia document
into a personalized document. A location of the hypermedia document is
specified, a type of the hypermedia document is specified, a scope of
data to be retrieved from the hypermedia document is specified,
wherein the scope is based on a structure of the hypermedia document,
and a format is specified for formatting the data retrieved from the
hypermedia document into the personalized document. The hypermedia
document found at the specified location is accessed, data is
retrieved from the hypermedia document in accordance with the
specified hypermedia document type and in accordance with the
specified scope, and the data is formatted into the personalized
document in accordance with the specified format.

In yet another aspect, the invention is a system for processing a
hypermedia document. The system accesses the hypermedia document,
extracts addresses from the hypermedia document, and stores the
addresses extracted from the hypermedia document in a container. The
system activates a processing function to process data stored at the
addresses stored in the container, downloads the data stored at the
addresses stored in the container into a memory, and extracts
predetermined data from downloaded data in accordance with
predetermined configuration information. The predetermined data is
then formatted in accordance with predefined formatting settings to
generate a formatted document, and the formatted document is processed
in accordance with the processing function.
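
The sequence in this aspect (extract addresses into a container,
download, extract predetermined data, format, process) can be sketched
in Python. This is only an illustration under stated assumptions, not
the implementation described in the specification: the names
process_document, fetch, pattern and process are hypothetical,
addresses are assumed to appear as href attributes, and a dictionary
stands in for the network.

```python
import re

def process_document(document, fetch, pattern, process):
    """Illustrative pipeline; every name here is hypothetical."""
    # Container: addresses kept in the order they occur in the document.
    container = re.findall(r'href="([^"]+)"', document)
    # Download the data stored at each address in the container.
    downloaded = [fetch(address) for address in container]
    # Extract the predetermined data (here: first match of a pattern).
    extracted = []
    for data in downloaded:
        match = re.search(pattern, data)
        if match:
            extracted.append(match.group(1))
    # Format the extracted data into one document, then process it.
    formatted = "\n".join("* " + item for item in extracted)
    return process(formatted)

# A dictionary stands in for the network (assumption for the example).
pages = {"/a": "<h1>Alpha</h1> body", "/b": "<h1>Beta</h1> body"}
doc = '<a href="/a">A</a> <a href="/b">B</a>'
result = process_document(doc, lambda a: pages[a],
                          r"<h1>(.*?)</h1>", str.upper)
```

Here str.upper plays the role of the processing function; printing or
saving the formatted document would be passed in the same way.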

In preferred embodiments, the system inputs the formatting 

settings and configuration information via a graphical user interface. 
The graphical user interface comprises plural processing icons, one of 

which activates the processing function. By virtue of the graphical us 
er interface, a user can interactively set a document's format and chang 
e that format should a change be desired. 

In particularly preferred embodiments, the graphical user interface is
displayed in plural modes. The plural modes comprise (1) a
fully-functional mode in which the graphical user interface displays
formatting fields, processing options, menus and the processing icons,
and (2) a minimizing mode in which the graphical user interface
displays only the processing icons. Typically, the graphical user
interface displayed in the minimizing mode is displayed during
browsing the hypermedia document. By displaying the graphical user
interface in plural modes, the present invention facilitates operation
of the invention during browsing of the hypermedia document.

This summary has been provided so that the nature of the invention may
be understood quickly. A more complete understanding of the invention
can be obtained by reference to the following detailed description of
the preferred embodiments thereof in connection with the attached
drawings.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Figure 1 is a view showing the outward appearance of a representative
embodiment of the invention. Shown in Figure 1 is computing equipment
1, such as a Macintosh or an IBM PC or a PC-compatible computer,
having a windowing environment, such as Microsoft Windows. Provided
with computing equipment 1 is display screen 2, such as a color
monitor or a monochromatic monitor, keyboard 3 for entering text data
and user commands, and a pointing device such as mouse 4 for pointing
and for manipulating objects displayed on display 2. Computing
equipment 1 also includes a mass storage device such as disk drive 5.
Image data can be input into computing equipment 1 from a variety of
sources such as a network interface 11a or from external devices via
facsimile/modem interface 6. Network interface 11a is used to connect
computing equipment 1 to a local area network (LAN) or to a wide area
network (WAN) such as the World Wide Web.

Figure 2 is a detailed block diagram showing the internal construction
of computing equipment 1. As shown in Figure 2, computing equipment 1
includes central processing unit (CPU) 8 interfaced with computer bus
9. Also interfaced with computer bus 9 is printer interface 10,
fax/modem interface 6, display interface 11, network interface 11a,
keyboard interface 12, mouse interface 13, main memory 14 and disk
drive 5.

Main memory 14 interfaces with computer bus 9 so as to provide random
access memory storage for use by CPU 8 when executing an application
such as personal-news-profile editor 16 or Web printer 17. More
specifically, CPU 8 loads these software applications from disk drive
5 into main memory 14 and executes the software applications out of
main memory 14. In accordance with user instructions, stored
application programs are activated which permit processing and
manipulation of data. Typically, the software applications stored on
disk drive 5, such as personal-news-profile editor 16, Web printer 17,
and HTML formatter 18, have been stored on disk drive 5 by downloading
the software applications from a computer-readable medium such as a
floppy disk or CD ROM, or by downloading the software applications
from a computer bulletin board.

Disk drive 5 stores data files which can include text files and image
files, in compressed or uncompressed format, and stores software
application files such as those noted above. The software application
files include Windows applications, DOS applications, and personal
news retrieval files 15. Personal news retrieval files 15 include
personal-news-profile editor 16, Web printer 17, HTML formatter 18,
personal-news-profile(s) 19, and site profile(s) 20. The detailed
functions of personal news retrieval files 15 will be discussed below,
after a brief overview of the operation of the personal news retrieval
system.

Overview of Document Retrieval

Figure 3, comprised of Figures 3A to 3D, illustrates the operation of
a representative embodiment of the invention. Figure 3A is a graphical
representation of a typical Web site 21 with news information
contained therein. Within Web site 21 is homepage 22 with links to
indices such as headings 23, which are in turn linked to articles 24.
Some of articles 24 are linked to other articles. As article H 26
resides on another Web site, link 25 is a cross-site link. Link 25
illustrates how a single hypermedia document, represented by homepage
22, can traverse multiple Web sites.

In order to retrieve news from Web site 21, the invention first
traverses Web site 21 to retrieve data according to user-defined
rules. As will be discussed in more detail below, these rules can be
based on the structure of Web site 21, or on the structure of Web site
21 and its contents. The data is retrieved into an extracted data
tree, which preserves the organization of the data as shown in Figure
3B, but in which some links are excluded.

The organization of extracted data tree 27 has several features.
First, extracted data tree 27 has root 28 which can have child nodes
for one or more sites 29, which in turn can have index nodes 30 which
correspond to indices/headings 23, article nodes 31, and the like.
Second, extracted data tree 27 is a true tree, with no loops (i.e.,
cyclic paths) therein. For example, Figure 3A shows a loop from
homepage 22 to index node #1, to article C, and then back to homepage
22. This loop is removed when creating extracted data tree 27.
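
One way to picture loop removal is a traversal that records each page
already placed in the tree and ignores any later link back to it. The
sketch below is an assumption-laden illustration, not the method of
the specification: the link graph, the page names and the
breadth-first order are all hypothetical choices made for the example.

```python
from collections import deque

def build_extracted_tree(links, root):
    """Return a loop-free tree (parent -> children) from a link graph.

    `links` maps each page to the pages it links to. Both the data
    layout and the breadth-first order are assumptions for this sketch.
    """
    tree = {root: []}
    visited = {root}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target in visited:
                # Link leads back to a page already in the tree: drop it,
                # which removes loops and duplicate downloads.
                continue
            visited.add(target)
            tree[page].append(target)
            tree[target] = []
            queue.append(target)
    return tree

# Hypothetical site with a loop: homepage -> index1 -> articleC -> homepage
links = {
    "homepage": ["index1", "index2"],
    "index1": ["articleA", "articleC"],
    "index2": ["articleC"],
    "articleC": ["homepage"],
}
tree = build_extracted_tree(links, "homepage")
```

In this example articleC is attached only under index1, the first
place the traversal reached it, and its link back to the homepage is
discarded.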

Third, the organization of extracted data tree 27 depends on how the
Web sites are traversed, and not on the Web sites' actual layouts.
Thus, article H 26 appears under index node #3 (under site #1),
indicating that the news retrieval system accessed article H 26 from
site #1 via cross-site link 25.

Finally, as noted earlier, certain articles have been excluded from
extracted data tree 27 due to the structure of Web site 21 or possibly
a content of indices/headings 23 and articles 24. For example,
articles R and G have been excluded from extracted data tree 27.

According to the invention, extracted data tree 27 is flattened into
linear document 32, as shown in Figure 3C, possibly with reference to
more exclusion rules. Linear document 32 is simply a continuous
document with information from extracted data tree 27 embedded
therein.
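
Flattening can be illustrated as a pre-order walk that emits each node
of the tree in sequence. The following is a minimal sketch, assuming a
dictionary-based tree and a simple set of excluded nodes standing in
for exclusion rules; the function name and data layout are
hypothetical and do not come from the specification.

```python
def flatten(tree, node, depth=0, skip=frozenset()):
    """Pre-order walk returning a list of (depth, node) pairs.

    `skip` is a hypothetical stand-in for exclusion rules applied
    during flattening; an excluded node drops its whole subtree.
    """
    if node in skip:
        return []
    linear = [(depth, node)]
    for child in tree.get(node, []):
        linear.extend(flatten(tree, child, depth + 1, skip))
    return linear

# Hypothetical extracted data tree: root -> site -> indices -> articles
tree = {
    "root": ["site1"],
    "site1": ["index1", "index2"],
    "index1": ["articleA", "articleC"],
    "index2": ["articleD"],
}
linear = flatten(tree, "root", skip={"articleC"})
```

Keeping the depth alongside each entry lets a later formatting stage
tell site labels, headings and articles apart.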

Finally, linear document 32 is formatted according to user specified
(or default) formatting instructions into formatted document 33, shown
as a stylized personal newspaper in Figure 3D. Formatted document 33
has various fonts and/or colors for site labels, indices/headings,
articles, and the like. Furthermore, formatted document 33 is broken
down into pages.

Note that in alternate embodiments of the news retrieval system,
certain stages of the above transformation from Web site 21 to
formatted document 33 can be skipped. For example, data from Web site
21 can be retrieved directly into flattened document 32, as long as a
record of the organization of the data is maintained (possibly in a
separate linked list) so as to avoid downloading the same article
twice and so as to avoid loops in the organization of Web site 21.
Alternatively, extracted data tree 27 can be directly formatted into
formatted document 33. In any case, the basic operation of the
invention remains the same: the news retrieval system traverses a
hypermedia document on the Web, extracts data according to
user-defined information, and formats the data into a personalized
newspaper.

As mentioned in the above discussion, various user-defined rules and
other information (such as formatting information) are involved in the
news retrieval process. That user defined information is stored in
personal-news-profile(s) 19, the definition of which is described
next.

Defining a Personal-News-Profile 

Figures 4 and 5 illustrate the process by which personal-news-profile
19 is defined. To create personal-news-profile 19,
personal-news-profile editor 16 communicates with
personal-news-profile 19, site profile 20, and Web reader 34.

Personal-news-profile 19 contains information as to what s 
ites to access for creating a personalized newspaper, what sections to r 
etrieve from those sites, rules to be used to determine what data to ext 
ract from the sections and the articles therein, rules to determine how 
to exclude links, and newspaper format information. A sample personal-n 
ews-profile is shown in Appendix 1. 

Site profile 20 includes general site information that is not specific
to a particular user. For example, site profile 20 could contain
information such as full site addresses, sections within a site,
non-user specific passwords, etc. Sample site profiles are shown in
Appendix 1. Because general site information is stored in site profile
20, personal-news-profile 19 can refer to the general site information
with reference to site profile 20, saving space in the
personal-news-profile. For example, as shown in Appendix 1,
personal-news-profile 19 can refer to a site number 1. Site profile 20
indicates that site number 1 is the "San Jose Mercury News," with a
homepage at "http://www.sjmercury.com". This construction also
centralizes general site information. Thus, if a site address changes,
only site profile 20 needs to be changed to update all
personal-news-profiles 19 on the system.
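
The indirection between the two profiles can be sketched as a lookup
by site number. This is an illustration only: the dictionary layout,
the user names, the section names and the replacement address are
hypothetical and are not taken from Appendix 1.

```python
# Shared, non-user-specific site information (layout is an assumption).
site_profile = {
    1: {"name": "San Jose Mercury News",
        "homepage": "http://www.sjmercury.com"},
}

# Per-user profiles refer to sites by number only (hypothetical users).
personal_news_profiles = {
    "user_a": [{"site": 1, "section": "business"}],
    "user_b": [{"site": 1, "section": "sports"}],
}

def resolve(profile, sites):
    """Expand each site number into the full address held centrally."""
    return [(sites[entry["site"]]["homepage"], entry["section"])
            for entry in profile]

before = resolve(personal_news_profiles["user_a"], site_profile)

# One change to the shared site profile is seen by every user profile.
site_profile[1]["homepage"] = "http://news.example.com"
after_a = resolve(personal_news_profiles["user_a"], site_profile)
after_b = resolve(personal_news_profiles["user_b"], site_profile)
```

Because the user profiles store only the site number, neither of them
had to be edited when the address changed.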

Web reader 34 is an application program or program module that
communicates with the Web via Web server 35. In response to commands
from personal-news-profile editor 16, Web reader 34 will access the
Web, traverse hypermedia documents on the Web, retrieve data from the
documents, and return the retrieved data to personal-news-profile
editor 16.

As shown in Figure 4, personal-news-profile editor 16 includes four
modules: site driver 36, Web reader interface 37, profile manager 38
and format editor 39.

Web reader interface 37 interfaces personal-news-profile editor 16 to
Web reader 34. Site driver 36 interacts with Web reader 34 via Web
reader interface 37 to provide an abstract interface to each
individual Web site. More specifically, site driver 36 instructs Web
reader 34 to access various Web sites and to retrieve data from those
sites. Thereafter, site driver 36 receives that data and builds site
profile 20 therefrom. The data can also be used to update an existing
site profile.

In building site profile 20, site driver 36 translates the structure
of each accessed Web site to a uniform structure defined in site
profile 20, and stores data retrieved therefrom in site profile 20. By
translating different Web sites, some of which may have different
structures, into a single uniform structure and storing data therefrom
in that structure in site profile 20, the present invention
facilitates access to information from different Web sites, and thus
reduces overall processing time.

Profile manager 38 maintains document templates that specify how to
format a personalized newspaper. Predefined document templates exist.
In addition, format editor 39 allows a user to specify personalized
templates for formatting a newspaper, either by editing existing
templates or by creating new ones. In any case, each document template
specifies page layout information, font information, style
information, colors, etc. for the titles, indices/headings,
subheadings, text and the like for a personalized newspaper.

Sample code for personal-news-profile editor 16, site driver 36, and
profile manager 38 is included in Appendix 3A.

Figures 5A and 5B are flow diagrams describing the operation of
personal-news-profile editor 16 in more detail. Figure 5A shows the
operation of personal-news-profile editor 16 in defining the parts of
personal-news-profile 19 relating to accessing Web sites and
retrieving data from those sites.

In step S500 of Figure 5A, personal-news-profile editor 16 is launched
by a user. In step S501, the editor launches Web reader 34. The user's
personal I.D. is then retrieved in step S502. If a
personal-news-profile already exists for that I.D., step S503 directs
flow to step S504, where the user is given the option of skipping to
the format editor. Otherwise, personal-news-profile editor 16 enters a
"learning mode" in step S505. Once in the learning mode,
personal-news-profile editor 16 proceeds to step S506, where it
accepts a Web command (i.e., a command to traverse a hypermedia link)
from the user and forwards the Web command to the Web reader by means
of site driver 36. Site driver 36 maintains a hierarchical log of Web
sites visited by Web reader 34. In step S507, personal-news-profile
editor 16 creates an extraction rule from the Web command. This rule
will allow the news retrieval system to later duplicate the user's
selection criteria in browsing (clicking on hyperlinks within) a Web
site.

The rule specifies, at the least, structural criteria for duplicating
the traversal of the Web site. For example, if a user accesses all
articles under a particular index/heading, the rule will specify that
all articles under that index/heading should be retrieved.
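
The generalization from recorded selections to a structural rule can
be sketched as follows. This is only one possible reading, offered
under assumptions: the rule format, the function name and the sample
headings are hypothetical and are not the syntax of Appendix 2.

```python
def extract_rules(visited, available):
    """Derive one structural rule per heading from a browsing log.

    `visited` maps a heading to the articles the user opened and
    `available` maps it to every article listed there. The rule
    format below is a hypothetical stand-in.
    """
    rules = []
    for heading, opened in visited.items():
        listed = available[heading]
        if set(opened) == set(listed):
            # The user opened everything: generalize to "all articles".
            rules.append({"heading": heading, "select": "all"})
        else:
            # Otherwise remember positions, not titles, so the rule
            # still applies when tomorrow's articles differ.
            positions = sorted(listed.index(a) for a in opened)
            rules.append({"heading": heading, "select": positions})
    return rules

available = {"Business": ["a1", "a2"], "Sports": ["s1", "s2", "s3"]}
visited = {"Business": ["a1", "a2"], "Sports": ["s1"]}
rules = extract_rules(visited, available)
```

Storing positions rather than titles is a design choice made for this
sketch so that the rule describes structure instead of content.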

In one embodiment of the invention, the rule can also include
content-based criteria (i.e., keyword-based criteria) accepted from
the user. These content-based rules can, for example: (1) require
certain words to be in an article, (2) exclude articles with certain
words, (3) require certain boolean combinations of words, (4) rank
articles that are selected based on structural criteria, with the
ranking based on keywords, and then require the selection of the
articles with the highest ranking(s), or (5) exclude certain types of
articles such as advertisements.
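
A few of these content-based criteria (required words, excluded words,
and keyword ranking with a cut-off) can be sketched together. The
scoring scheme, function name and sample articles below are
assumptions made for illustration; the specification does not define
them.

```python
def apply_content_rules(articles, require=(), exclude=(),
                        rank_by=(), top_n=None):
    """Filter and rank (title, text) pairs by simple keyword rules.

    Scoring by raw keyword counts is an assumption for this sketch.
    """
    selected = []
    for title, text in articles:
        words = text.lower().split()
        if any(word not in words for word in require):
            continue  # a required word is missing
        if any(word in words for word in exclude):
            continue  # an excluded word is present
        score = sum(words.count(keyword) for keyword in rank_by)
        selected.append((score, title))
    # Highest score first; the sort is stable, so ties keep input order.
    selected.sort(key=lambda pair: -pair[0])
    titles = [title for _, title in selected]
    return titles if top_n is None else titles[:top_n]

articles = [
    ("A", "stocks rally as markets rise"),
    ("B", "advertisement buy stocks now"),
    ("C", "stocks stocks markets"),
    ("D", "local sports results"),
]
chosen = apply_content_rules(
    articles, require=["stocks"], exclude=["advertisement"],
    rank_by=["stocks", "markets"], top_n=2)
```

Article B is dropped by the exclusion word, D lacks the required word,
and the remaining two are ordered by keyword count.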

Examples of the syntax for the structural and content-based exclusion
rules are shown in Appendix 2. Several different types of rules are
shown. Some simply limit the traversal of a Web site to a certain
number of links. Others are date and keyword based exclusion rules.
One particularly flexible rule indicates that articles should be
ranked based on a keyword analysis and the top scoring articles should
be chosen. Other rules include "flattening" rules. These rules control
the flattening of the extracted data tree, as will be explained in
more detail below.

At the least, the rule includes structural information about the
user's selection (i.e., first page, first document, all links, etc.),
necessary password information, browser commands, and the like. The
rule can also include a pointer or a reference to site profile 20 and
the appropriate information therein. General (non-user specific)
information is used by site driver 36 to maintain site profile 20. In
this manner, address information and passwords common to multiple
users can be maintained in site profile 20, as discussed above. For
example, site driver 36 will store commands or hyperlinks to other
documents in a Web page in the rule, but will not store a Web site's
full address in the rule. That address information is stored in site
profile 20.

In step S508, rule data defining the rule created from a Web
command(s) is stored in an extracted data tree such as extracted data
tree 27 in Figure 3B. This data tree is a linked list that reflects
the organization of the data retrieved from the Web. In step S509,
flow returns to step S506 for the next Web command unless the user is
done (i.e., the user signs off the Web site), in which case flow
proceeds to step S510.

At this point, the creation of the personal-news-profile has proceeded
much like the creation of a macro common to word processing programs,
except that site profile 20 has been used to minimize storage
requirements and to centralize general site information. In order to
minimize storage requirements further and in order to make the news
retrieval system more flexible and efficient, the extracted rules are
now compiled to remove redundant links, multiple visits to the same
site, and the like. This occurs in step S510, and the resulting
compiled rules become the first part of personal-news-profile 19.
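
The compilation step can be pictured as grouping recorded rules by
site and discarding repeats. The sketch below assumes rules are simple
(site, link) pairs in recorded order; that representation and the
function name are hypothetical, chosen only to illustrate the idea.

```python
def compile_rules(recorded):
    """Group recorded (site, link) pairs by site and drop duplicates.

    Merging all visits to a site into one entry removes repeat visits;
    skipping links already listed removes redundant links. The pair
    format is an assumption for this sketch.
    """
    compiled = {}
    for site, link in recorded:
        links = compiled.setdefault(site, [])
        if link not in links:
            links.append(link)
    return compiled

# Hypothetical learning-mode log: site 1 is visited three times and
# its business section is selected twice.
recorded = [(1, "business"), (2, "sports"), (1, "front"), (1, "business")]
compiled = compile_rules(recorded)
```

After compilation each site appears once, so a later retrieval run
needs only one visit per site.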

Alternatively, personal-news-profile editor 16 may be invoked as a
graphical user interface which allows a user to edit a previously
stored personal-news-profile or to specify document composition
preferences, for example, by specifying news sites, headline articles
only, keywords, etc. In either case, the result is
personal-news-profile 19, which comprises a listing of Web site
pointers as well as extracted rules for traversing through a Web site
or sites.

For a better understanding of the above, sample personal-n 
ews-profiles and sample site profiles are provided in Appendix 1 as note 
d above. 

Next, operation proceeds to give the user an option to modify a custom
newspaper template, as shown in Figure 5B. In step S511, it is
determined if a newspaper template has been defined and stored in
personal-news-profile 19. If a newspaper template has been defined,
step S512 gives the user the option to edit the template or to proceed
to step S520. If the user chooses to edit the template or if no
newspaper template has been defined, flow proceeds to step S513.

Step S513 gives the user the option of creating a custom template or
using a predefined template. If the user wants to use a predefined
template, step S514 gets the specified predefined template, which is
added to the personal news profile in step S519. Otherwise, flow
proceeds to step S515, where format editor 39 is invoked.

Format editor 39 has a graphical user interface that provides the user
with a number of formatting options. In step S516, format editor 39
allows the user to define which newspaper sections are to be printed
in the newspaper, which Web site's news articles are to be placed in
each section, and/or how each page is to be laid out. In this regard,
the user can specify which Web site's news articles are to be used as
a front page, which Web site's news articles are to be used as a
business page, which Web site's news articles are to be used as a
sports page, etc. In addition, in step S516, the user can define where
each index/heading should be listed, as well as what sub-headings
should go on each page.

In step S517, format editor 39 allows the user to define the font
styles for indices/headings, sub-headings, bylines and actual text of
news articles. In step S518, format editor 39 prompts the user to
define index/heading colors, title colors, etc. In this regard, format
editor 39 is capable of determining the types of fonts and colors
available to the user based on the system's printer capabilities.

Once all of the information is gathered for the custom template, the
format editor adds the information to personal-news-profile 19 in step
S519. Alternatively, profile manager 38 may also store the custom
format as a template in a common area for use by other users. In this
case, only a pointer or reference to the custom template is stored in
personal-news-profile 19.

In the following step, personal-news-profile editor 16 prompts the
user to set an automatic newspaper delivery time and method (i.e.,
print or store on disk drive 5 for later printing). These settings are
added to personal-news-profile 19. More specifically, in the case that
a user's computer is continuously supplied with power, the Web news
retrieval system can be launched automatically at a designated time.
The system will retrieve articles from the Web sites which are listed
in personal-news-profile 19. Upon retrieving the news articles, the
articles will be formatted based on the newspaper template in
personal-news-profile 19. The formatted personalized newspaper can
then be either printed or stored for later viewing. In the case that a
time is not set for newspaper delivery, the user can execute the Web
news retrieval system program at any time.

Once personal-news-profile 19 has been created, the Web news retrieval
system, upon being launched, can traverse Web news sites and build a
personalized newspaper by automatically retrieving various news
articles from the Web news sites and printing the news articles based
on the newspaper template indicated in personal-news-profile 19. How
the Web news retrieval system of the invention performs this function
is described next.

Retrieving a Document Using a Personal-News-Profile 

Figure 6 is a representational block diagram of the manner by which
the invention retrieves articles from the Web according to
personal-news-profile 19. (Figure 6 also shows the manner by which the
retrieved articles are flattened into a linear document and formatted.
These functions are discussed in greater detail in the next section of
this application.)

As shown in Figure 6, Web printer 17 is responsible for retrieving
news articles. Web printer 17 is an end-user application that
communicates with personal-news-profile(s) 19, site profile 20, Web
reader 34, and output interface 40 in order to perform this function.

Web printer 17 looks at personal-news-profile 19 to determine which
Web sites to access and which data to retrieve from those sites. Web
printer 17 also looks at site profile 20 for general site information.
According to the information in personal-news-profile 19 and site
profile 20, Web printer 17 instructs Web reader 34 to connect to the
Web via Web server 35 in order to access various Web sites and to
retrieve data from those sites. Web reader 34 sends the retrieved data
to Web printer 17, and Web printer 17 uses the data to build an
extracted data tree. As will be discussed in greater detail in the
next section of the application, Web printer 17 then flattens the
extracted data tree into a linear document and formats the linear
document for output via output interface 40.

As shown in Figure 6, Web printer 17 includes four program modules:
Web reader interface 50, site driver 51, tree manager 41, and
formatter 42.

Web reader interface 50, like Web reader interface 37 described above,
interfaces Web printer 17 to Web reader 34.

Site driver 51 accesses site profile 20 and personal-news-profile 19
and provides data stored therein to Web reader 34. As noted above, Web
reader 34 uses that data to access various Web sites and to extract
data therefrom. As noted above, this retrieved data is used by Web
printer 17 to build an extracted data tree.

Tree manager 41 manages the extracted data tree. In this regard, tree
manager 41 keeps track of the organization of the retrieved data in
the extracted data tree. This allows Web printer 17 to avoid accessing
the same article twice, to avoid unnecessarily re-visiting a Web site,
and to avoid getting caught in a cycle (loop) in the organization of a
hypermedia document on the Web. Alternatively, tree manager 41 could
store the data in blocks (as opposed to directly in a data tree) with
reference to a linked list that provides the same functionality as the
extracted data tree. Sample code for tree manager 41 is included in
Appendix 3B.
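
The bookkeeping described above can be sketched in Python as follows.
This is a minimal illustration and not the sample code of Appendix 3B;
the class TreeManager, its add method, and the crawl helper with its
links_of callback are all hypothetical names, and the sketch assumes
that pages are identified by their addresses.

```python
class TreeManager:
    """Tracks which pages are already in the extracted data tree."""

    def __init__(self):
        self.children = {}  # address -> list of child addresses
        self.seen = set()   # every address already placed in the tree

    def add(self, parent, url):
        """Record url under parent; return False if url was seen before."""
        if url in self.seen:
            return False
        self.seen.add(url)
        self.children.setdefault(url, [])
        if parent is not None:
            self.children.setdefault(parent, []).append(url)
        return True


def crawl(start, links_of, manager):
    """Depth-first traversal that never retrieves the same page twice.

    links_of(url) is an assumed callback returning the addresses that
    the page at url links to.
    """
    order = []
    stack = [(None, start)]
    while stack:
        parent, url = stack.pop()
        if not manager.add(parent, url):
            continue  # already in the tree: skip, which also breaks cycles
        order.append(url)
        for child in reversed(links_of(url)):
            stack.append((url, child))
    return order
```

Because every address is checked against the set of pages already in
the tree before it is retrieved, a link that leads back to an earlier
page is skipped rather than followed again.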

Formatter 42 is responsible for flattening the extracted data tree
into a linear document and formatting the linear document into a
personalized newspaper. Formatter 42 performs these functions in
accordance with the print criteria and format information (i.e.,
newspaper template) indicated in personal-news-profile 19. Sample code
for formatter 42 is included in Appendix UH.

In more detail, Figure 7 is a flow diagram describing how Web printer
17 uses Web reader 34 to traverse the Web according to
personal-news-profile 19 and to retrieve articles from the Web
according to the profile, excluding unwanted data.

The Web printer starts in step S700. In step S701, Web printer 17
retrieves either a user-designated personal-news-profile or a default
personal-news-profile stored in disk drive 5 using site driver 51. In
this regard, because computer equipment 1 may be used by more than one
user, there may be one or more personal-news-profiles stored on the
equipment, one of which will be designated as the default. Upon
retrieving the designated personal-news-profile, in step S702 Web
printer 17 determines whether any news data has been previously stored
to disk drive 5 (for example, by a previous automatic news delivery)
or if news articles should be retrieved using personal-news-profile
19.

In the case that news data does exist on disk drive 5, in step S703
the stored news data is retrieved and flow proceeds to step S801 of
Figure 8, discussed in more detail in the next section. On the other
hand, if no stored news data exists, Web printer 17 invokes Web reader
34 in step S704. Note that this is the same Web reader 34 as discussed
above with respect to defining a personal-news-profile.

Upon being invoked, Web reader 34 connects to Web server 35 in step
S705, which provides a connection to a network, such as the World Wide
Web. Web printer 17 then provides Web reader 34 with an address for
the first Web site to be visited based on information retrieved from
personal-news-profile 19. Once connected to the desired Web site in
step S706, Web printer 17 provides Web reader 34 with commands/links
for traversing the Web to the next Web page containing information
that personal-news-profile 19 indicates should be retrieved. Web
reader 34 traverses the Web according to this information in step
S707.

In step S708, Web reader 34 retrieves the desired information and
sends it to Web printer 17 according to the rules in
personal-news-profile 19. Thus, data exclusion occurs in this step.
The rules in personal-news-profile 19 specify structural and
content-based criteria for excluding data from the personalized
newspaper. The structural rules limit the retrieved information on the
basis of the structure of the Web site accessed by Web reader 34. The
content-based rules limit the retrieved information on the basis of
its content. As mentioned above with respect to creating a
personal-news-profile, examples of the syntax of the retrieval rules
in personal-news-profile 19 are included in Appendix 2.

In addition to rule-based exclusion, media-type exclusion occurs in
step S708, wherein data of a media type that cannot be printed is
excluded from the extracted data tree. For example, movie and sound
data can be excluded.
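
As a hedged illustration of media-type exclusion, the following Python
sketch keeps only items whose media type can be printed. It assumes,
purely for the example, that each retrieved item is labeled with a
MIME-style media type string and that text and image types are
printable; the names PRINTABLE_PREFIXES, is_printable and
exclude_unprintable are hypothetical.

```python
# Assumption for this sketch: text and images can be printed, while
# movie (video/...) and sound (audio/...) data cannot.
PRINTABLE_PREFIXES = ("text/", "image/")


def is_printable(media_type):
    """Return True if data of this media type can appear on paper."""
    return media_type.lower().startswith(PRINTABLE_PREFIXES)


def exclude_unprintable(items):
    """Filter (address, media_type) pairs down to the printable ones."""
    return [(url, mt) for url, mt in items if is_printable(mt)]
```

Filtering at retrieval time, as in step S708, means that unprintable
data never enters the extracted data tree and need not be considered
again during flattening.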

Web printer 17 stores the retrieved data in disk drive 5 (or in main
memory 14) in the extracted data tree managed by tree manager 41.
Alternatively, the data could be stored in blocks with reference to a
linked list, as discussed earlier. In step S709, Web printer 17
returns to step S707 to complete retrieving all information from Web
pages at the Web site. In step S710, upon completing a traversal of
one Web site, Web printer 17 uses tree manager 41 to compare the sites
remaining in personal-news-profile 19 with the site organization
information in the extracted data tree to determine if more sites need
to be visited. In the case that more Web sites need to be visited,
step S710 returns flow to step S706 and news articles are retrieved in
the same manner as discussed above. On the other hand, if all of the
Web sites listed in personal-news-profile 19 have been visited and all
of the articles retrieved, flow proceeds to step S801 in Figure 8.

Flattening and Formatting the Retrieved Data 

Figure 8 is a flow diagram showing how the extracted data tree is
flattened and formatted. The configuration of the invention is the
same as when retrieving data from the Web (shown in Figure 6). In
fact, the flattening and formatting processes can occur, at least to a
limited extent, concurrently with the data retrieval process.

In step S801 of Figure 8, the extracted data tree is flattened. This
simply means that the organization of the data is converted from an
extracted data tree to a linear document. This step provides the
opportunity for excluding more data from the personalized newspaper,
for example by including only certain nodes of the data tree in the
flattened document. This exclusion process is controlled by the
flattening rules in personal-news-profile 19.
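
The flattening of step S801 can be illustrated with a short Python
sketch. It is offered only as an example under stated assumptions: the
Node class, the flatten function and the keep predicate (standing in
for the flattening rules of the profile) are hypothetical, and a
pre-order walk is assumed as the way of linearizing the tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    text: str = ""
    children: list = field(default_factory=list)


def flatten(root, keep=lambda node: True):
    """Convert a data tree into a linear list of (depth, title, text).

    Nodes are visited in pre-order, so a section precedes its articles.
    keep is an assumed stand-in for the flattening rules: nodes for
    which it returns False are left out of the linear document.
    """
    linear = []

    def visit(node, depth):
        if keep(node):
            linear.append((depth, node.title, node.text))
        for child in node.children:
            visit(child, depth + 1)

    visit(root, 0)
    return linear
```

For example, passing keep=lambda node: bool(node.text) would retain
only nodes that carry article text and drop nodes that merely group
other nodes.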

After the data is flattened into a linear document, the data is
formatted in step S802 according to the template indicated in
personal-news-profile 19. The definition of this template, which is
either a pre-defined template or a custom template, was discussed
earlier. Finally, in step S803, the formatted and fully personalized
newspaper is sent to output interface 40. This interface could be
printer interface 10 to printer 7, display interface 11 to display 2,
or even modem/fax interface 6.

Second Embodiment: The HTML Formatter 

The second embodiment of the invention is a system for processing a
hypermedia document. The system accesses the hypermedia document,
extracts addresses from the hypermedia document, and stores the
addresses extracted from the hypermedia document in a container. The
system activates a processing function to process data stored at the
addresses stored in the container, downloads the data stored at the
addresses stored in the container into a memory, and extracts
predetermined data from the downloaded data in accordance with
predetermined configuration information. The predetermined data is
then formatted in accordance with predefined formatting settings to
generate a formatted document, and the formatted document is processed
in accordance with the processing function.
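
The first stage described above, extracting addresses from a
hypermedia document and storing them in a container, can be sketched
with the Python standard library. The sketch is illustrative only: the
class LinkExtractor and the function extract_addresses are
hypothetical names, the container is assumed to be an ordered list,
and only anchor (a) tags with an href attribute are treated as
addresses.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the addresses of anchor tags into an ordered container."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.container = []  # extracted addresses, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative addresses against the document's own URL.
            url = urljoin(self.base_url, href)
            if url not in self.container:
                self.container.append(url)


def extract_addresses(html, base_url):
    """Return the addresses found in html, without duplicates."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.container
```

The later stages (downloading, extracting predetermined data,
formatting) would then iterate over the returned container in order.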

The second embodiment of the invention is depicted as HTML formatter
18, noted in Figure 2. An example of HTML formatter 18 is
WebFormatter, manufactured by Canon Information Systems, Inc. The
second embodiment will be described with respect to WebFormatter. It
should be noted, however, that HTML formatter 18 is not limited to the
WebFormatter embodiment, and that various alternative embodiments
within the spirit and scope of the following description are possible.

WebFormatter is stand-alone utility software that can be used in
conjunction with different Web browsers, such as Netscape, Mosaic and
Internet Explorer. In short, WebFormatter extracts data from a Web
page, strips out extraneous data from the extracted data, and
reformats the data into a formatted document. The formatted document
can then be printed, stored in an RTF (Rich Text Format) file, or
edited in any RTF-compatible editor, such as MS Word, WordPerfect,
Wordpad, etc.

WebFormatter can be activated from a windowing environment, such as
Microsoft Windows (R). From such a windowing environment, WebFormatter
can be activated by double-clicking on a WebFormatter icon (not shown)
in a start-up window, selecting WebFormatter from the Windows start
menu, dragging a URL (uniform resource locator) icon (not shown) from
a Web browser and dropping it into the WebFormatter icon, or by
automatically invoking WebFormatter when the Web browser is started.

Unlike the first embodiment of the invention described above,
WebFormatter does not use a predefined personal-news-profile to
specify criteria for creating a particular type of document from one
or more Web pages. Rather, WebFormatter relies upon user-specified
criteria to create a particular type of document, such as a newspaper
or the like, from one or more Web pages. These criteria are input
interactively by a user via a graphical user interface.

As described in more detail below, WebFormatter operates in two modes:
a minimized mode and a fully-functional mode. In the minimized mode,
WebFormatter's graphical user interface is essentially a floating
print button, which is displayed concurrently with displayed Web
pages. By virtue of this feature, as a user explores the Web, the user
can process, format, and print out Web pages by merely clicking on the
floating print button.

In its fully-functional mode, WebFormatter's graphical user interface
provides spaces for a user to enter a URL address of a Web page to be
processed, enter a personal title for the document, select a format
for the document, preview a formatted first page of the document, and
either print the document, save the document as an RTF file, or
view/edit the document using an RTF editor. The graphical user
interface for the fully-functional mode will be described first, since
it is from that interface that the user can enter the minimized mode.

Figure 9A shows graphical user interface 43 for WebFormatter's
fully-functional mode. Graphical user interface 43 is displayed on
display 2 upon first activation of WebFormatter. As with any
interactive windowing software application, a user interacts with
graphical user interface 43 by means of mouse 4 (by pointing and
clicking) and keyboard 3.

As shown in Figure 9A, graphical user interface 43 includes fields 44
and 46 to 49, through which a user can specify the URL address of a
document to be formatted and the format of that document. Beginning
with URL field 44, a user enters the URL address (e.g.,
http://www.cis.canon.com/tis/tis_home.htm) of a Web page to be
processed by WebFormatter. There are several different ways for the
user to enter the URL address. The user can (1) type the address
directly into URL field 44, (2) copy the URL address in the Web
browser and paste the URL address into URL field 44, (3) drag the URL
address from the Web browser onto graphical user interface 43 or onto
the WebFormatter icon, or (4) click on Current URL button 54.

With regard to Current URL button 54, if a user clicks on Current URL
button 54, WebFormatter locates the active Web browser and queries the
Web browser for the address of the current Web page. Thereafter, the
Web browser provides the address of the current Web page to
WebFormatter, which places the address in URL address field 44. If
Current URL button 54 is activated and no Web browser is currently
running, WebFormatter displays dialog box 56, shown in Figure 9A.

As shown, dialog box 56 includes Cancel button 57 and Launch Browser
button 59. Cancel button 57 cancels a user's request to input a URL
address into URL address field 44 via Current URL button 54. Launch
Browser button 59, on the other hand, launches a Web browser specified
in WebFormatter. As noted below, WebFormatter is configured beforehand
with predefined information including a Web browser to be used with
WebFormatter. Configuration of WebFormatter will be described in more
detail below.

In alternative embodiments of WebFormatter, a filename can also be
entered into URL address field 44. For example, in these alternative
embodiments, if a user wishes to format a hyper-linked manual into a
book-like format, the user enters the filename into URL address field
44. Thereafter, WebFormatter proceeds through the file in the same
manner as through specified Web pages in order to reformat the
hyper-linked manual as desired.

Returning to graphical user interface 43, title field 46 enables a
user to enter a personalized title for a formatted document. The title
may be typed directly or pasted into title field 46.

Formatting fields 47 to 49 define the format of a document to be
output by WebFormatter. Options for the different formatting fields
can be accessed by clicking on a scroll bar, such as scroll bar 55, of
a respective formatting field. Each of these fields is described in
detail below.

Styles field 47 provides four options for formatting an output
document. These styles relate to characteristics of an output document
such as size of headers, margins, etc. The style options include
Contemporary, Formal, Fun and Professional. The invention, of course,
is not limited to these four style options, and other styles can be
added as desired.

Columns field 48 defines the number of columns in a formatted output
document. Two column options are available, Single and Multiple;
however, the invention is not limited to these two options. The Single
option, as might be expected, formats the document into a single
column. The Multiple option, on the other hand, formats the document
into a predetermined number of columns. In preferred embodiments of
the invention, the Multiple option is set to two columns; however, any
number can be set.

Spacing field 49 defines the spacing between lines in a formatted
output document. Three options are provided in WebFormatter, but other
options can be added as desired. These three options are Condensed,
Normal and Easy To Read, with Condensed providing the least spacing
between lines and Easy To Read providing the most spacing between
lines.
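
The three formatting fields can be thought of as one small settings
record. The Python sketch below is illustrative only: FormatSettings,
its field names and its default values are hypothetical, the third
style name is read here as "Fun", and the Multiple option is mapped to
two columns as in the preferred embodiments.

```python
from dataclasses import dataclass

STYLES = ("Contemporary", "Formal", "Fun", "Professional")
COLUMN_COUNTS = {"Single": 1, "Multiple": 2}  # Multiple = two columns
SPACINGS = ("Condensed", "Normal", "Easy To Read")


@dataclass(frozen=True)
class FormatSettings:
    # The defaults below are assumptions made for this sketch.
    style: str = "Contemporary"
    columns: str = "Single"
    spacing: str = "Normal"

    def __post_init__(self):
        # Reject values that are not among the options of each field.
        if self.style not in STYLES:
            raise ValueError(f"unknown style: {self.style!r}")
        if self.columns not in COLUMN_COUNTS:
            raise ValueError(f"unknown columns option: {self.columns!r}")
        if self.spacing not in SPACINGS:
            raise ValueError(f"unknown spacing: {self.spacing!r}")

    @property
    def column_count(self):
        """Number of columns implied by the columns option."""
        return COLUMN_COUNTS[self.columns]
```

Adding a further style or spacing option, as the description permits,
would only require extending the corresponding tuple or mapping.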

Graphical user interface 43 is also provided with Preview button 60.
By clicking on Preview button 60, a user can preview a first page of a
formatted document in viewing area 61. An example of a previewed
formatted document is shown in Figure 9A.

As shown in Figures 9A and 9B, WebFormatter also includes Options
button 81. Options button 81 provides a user with additional
formatting options which are used by WebFormatter to create a
formatted document. A user can activate Options button 81 by clicking
thereon. This causes Options dialog box 62, shown in Figure 9B, to
appear on display 2.

As shown in Figure 9B, Options dialog box 62 includes General options
64, Container options 66 and Strip Meta Info options 67. General
options 64 includes "Text only" listbox 72, "Index of links in the
page" listbox 73, and "No floating pictures" listbox 74. These options
are indicated as being selected by a check mark or the like in a
respective listbox. As will become clear from their descriptions, more
than one of the options in General options 64 can be selected at the
same time.

"Text only" listbox 72 instructs WebFormatter to strip all graphics in
a Web page and print only text therein. "Index of links in the page"
listbox 73 instructs WebFormatter to add a list of all URLs present in
a Web page or pages to the end of a formatted document. Preferably,
the list of URLs is printed as superscript, and anchor positions of
the URLs in the list are marked in bold. "No floating pictures"
listbox 74 instructs WebFormatter to print all images in the document
in a particular area of the formatted document. In some cases,
therefore, when this option is selected, WebFormatter shrinks images,
as needed, so that images fit into a particular area.

Strip Meta Info options 67 provides engineering options which
facilitate stripping of unnecessary information from a Web page being
processed by WebFormatter. The options include (1) "None", which
instructs WebFormatter to strip nothing from the Web page, (2) "Till
the first horizontal rule", which instructs WebFormatter to strip all
links and images until and up to predefined first and second
horizontal formatting rules (e.g., up until a horizontal line across a
page), and (3) "Till the first text", which instructs WebFormatter to
strip all links and images up to first and last occurrences of text in
the Web page. Only one of Strip Meta Info options 67 can be selected
at a time. Selection thereof is indicated by a dot in a bullet located
next to an option, as shown in Figure 9B.
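
A simplified reading of these three options is sketched below in
Python. The sketch is an assumption-laden illustration rather than
WebFormatter's implementation: the page is modeled as a list of
(kind, value) tokens with the hypothetical kinds "link", "image", "hr"
and "text", the function and mode names are invented, and only the
leading part of the page (before the first rule or first text) is
stripped.

```python
def strip_meta(tokens, mode="none"):
    """Drop leading links and images according to the selected option.

    mode is one of "none", "till_first_rule" or "till_first_text".
    Links and images are removed until the first token of the stopping
    kind is reached; everything from that token onward is kept.
    """
    stop_kind = {
        "none": None,
        "till_first_rule": "hr",
        "till_first_text": "text",
    }[mode]
    if stop_kind is None:
        return list(tokens)

    kept = []
    stripping = True
    for kind, value in tokens:
        if stripping and kind == stop_kind:
            stripping = False  # reached the boundary: stop stripping
        if stripping and kind in ("link", "image"):
            continue
        kept.append((kind, value))
    return kept
```

Note that, in this sketch, a page with no horizontal rule (or no text)
has all of its links and images stripped, since the boundary is never
reached.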

Container options 66 provides options for processing documents,
addresses for which are stored in container 76 shown in Figure 9B.
Prior to describing Container options 66, a description of container
76 will be provided.

As noted, container 76 stores URL addresses of selected documents.
Document addresses which are input to field 44 are added to container
76. The order in which URLs are input into container 76 denotes the
order in which data at the URLs is processed by WebFormatter. As shown
in Figure 9B, once container 76 becomes full, its icon changes to that
shown by reference numeral 77.

When a user clicks on the icon for container 76, menu 77 is displayed.
Menu 77 provides five options, i.e., Open 79, Empty 80, Print 81, Edit
82 and Save 84. These options are highlighted when activated, and are
described in detail below.

Open 79, when activated, displays Container Contents screen 87 shown
in Figure 9B. Container Contents screen 87 shows the URL addresses
stored in container 76. Container Contents screen 87 provides four
buttons, i.e., Add current URL button 88 which adds the current URL to
container 76, Delete button 89 which permits a user to highlight and
delete a URL in container 76, Empty button 90 which permits a user to
empty container 76, and Done button 91 which permits a user to close
Container Contents screen 87. It is noted that a user can also empty
the contents of container 76 by clicking on Empty 80 of menu 77.

In addition, the user can rearrange the order of URLs stored in
container 76 by dragging and dropping different URLs at different
locations therein. As noted above, since the URLs are processed in the
order that they appear in container 76, this feature permits a user to
rearrange the processing order of the URLs in container 76
interactively.
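
The container behavior described above (add, delete, empty, and
rearrange, with list order determining processing order) can be
modeled with a plain ordered list. The Python sketch below is only an
illustration; the class Container and its method names are
hypothetical, and the drag-and-drop gesture is represented by a move
method that takes a target position.

```python
class Container:
    """Ordered store of URL addresses; order is the processing order."""

    def __init__(self):
        self.urls = []

    def add(self, url):
        """Append a URL, as when an address is entered in the URL field."""
        self.urls.append(url)

    def delete(self, url):
        """Remove one URL, as with the Delete button."""
        self.urls.remove(url)

    def empty(self):
        """Remove every URL, as with the Empty button or menu option."""
        self.urls.clear()

    def move(self, url, new_index):
        """Reposition a URL, standing in for drag and drop."""
        self.urls.remove(url)
        self.urls.insert(new_index, url)
```

Because processing simply iterates over urls from first to last,
moving an entry is all that is needed to change when it is processed.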

Print 81, Edit 82 and Save 84, when activated, cause WebFormatter to
download all data at Web pages defined by the URLs stored in container
76, format them as specified by the user, create RTF file(s) storing
the formatted Web pages, and do the selected action, i.e., save, edit
or print the RTF file(s). This process is described in greater detail
below.

Referring back to Options dialog box 62, Container options 66 include
"Print table of contents" listbox 92 and "Empty after processing"
listbox 94. As shown, a check mark appears in a listbox to indicate
that the listbox has been selected. In this regard, more than one
listbox can be selected at a time. "Print table of contents" listbox
92, when selected, instructs WebFormatter to print titles of all URLs
in container 76 as a table of contents in a formatted output document.
"Empty after processing" listbox 94, when activated, instructs
WebFormatter automatically to empty container 76 after printing,
editing or saving a document, without waiting for a user to do so.

Also shown as part of Options dialog box 62 are Select RTF Editor
button 69, Cancel button 70 and OK button 71. By clicking on Select
RTF Editor button 69, a user can select an RTF file editor, examples
of which are noted above. This can be done, for example, by displaying
another dialog box listing predefined RTF editors (not shown) and
selecting one of the predefined RTF editors. Cancel button 70 cancels
Options dialog box 62, and OK button 71 confirms the options selected
in Options dialog box 62 and then closes the dialog box.

As shown in Figure 9B, graphical user interface 43 also includes print
icon 96, edit icon 97, save icon 99, help button 100, done button 101
and minimizing icon 102. A user may select any of these features by
clicking thereon using a mouse.

Print icon 96 opens a print dialog box (not shown), which allows a
user to print any number of copies of Web pages formatted by
WebFormatter. Edit icon 97 opens an RTF file storing formatted Web
page(s) for editing by a predetermined RTF editor. Save icon 99 opens
a save dialog box (not shown), which allows the user to name and save
a formatted Web page as an RTF file. Help button 100 provides help
messages for operating WebFormatter, and Done button 101 exits from
WebFormatter. Minimizing icon 102 activates the minimized mode of
WebFormatter which was mentioned above and which is described in
greater detail below.

Figure 9C shows menus provided by WebFormatter during its operation.
These menus include File menu 103, Edit menu 104 and Window menu 106.
File menu 103 provides "Save", "Edit" and "Print" options, the
functions of which are identical to those of Save icon 99, Edit icon
97 and Print icon 96, respectively. An "Exit" option is also provided
to exit from File menu 103. Finally, File menu 103 provides "Open HTML
file" option 107. This option provides a user with the capability to
open a local HTML file, i.e., a hypermedia file resident on the user's
computer such as a file saved from NetScape, or URL files created by
dragging and dropping a URL onto the Windows desktop. "Open HTML file"
option 107 also provides hooks needed to open files created by other
Web-file-processing products so that those files can be formatted as
RTF files and printed, saved and/or edited using WebFormatter.

Edit menu 104 provides "Paste URL" option 109. "Paste URL" option 109
pastes the contents of a paste buffer, such as a URL address copied
from a Web page, into URL field 44, as described above.

Window menu 106 provides a "Help Topics" option which provides a user
with information regarding the use, maintenance and background of
WebFormatter, and an "About WebFormatter" option which provides a user
with a dialog box (not shown) containing WebFormatter's version number
and copyright notice(s). Window menu 106 also includes "Preferences"
option 110. "Preferences" option 110 opens preferences dialog box 112,
shown in Figure 9D.

Preferences dialog box 112 is used to configure and re-configure
WebFormatter. As shown in Figure 9D, preferences dialog box 112
includes Minimize view options 113, General options 114 and WWW
Browser to use options 115. Minimize view options 113 can be set to
configure WebFormatter's graphical user interface in the minimized
mode. Two sets of options are provided. The first set includes
"Print", "Edit" and "Save". These options correspond to print icon 96,
edit icon 97 and save icon 99, shown in Figure 9B. When a check mark
appears in a listbox next to one of these options, the icon for that
option is displayed in the minimized mode, e.g., the print icon, the
edit icon and/or the save icon. More than one option can be selected
at once. In this regard, Figure 9E shows graphical user interface 116,
which is a representative example of a graphical user interface for
WebFormatter when WebFormatter is in the minimized mode.

Referring back to Figure 9D, Minimize view options 113 also include
"Row" and "Stack" options. These options can be set to display
WebFormatter's graphical user interface in the minimized mode
horizontally by selecting "Row" or vertically by selecting "Stack".
Only one of these options can be selected at a time. As an example of
the foregoing, graphical user interface 116 corresponds to a row of
icons.

WWW Browser to use options 115 determine which World Wide Web browser
is to be used with WebFormatter. As shown, preferably NetScape,
Internet Explorer and Mosaic are provided as browser options; however,
other browser options can also be provided. As might be expected, only
one of these options can be selected at a time. The default browser
option is NetScape Navigator.

General options 114 include "Auto-start with browser" option 117,
"Open in minimized view" option 118, "Warn before printing more than
    pages" option 119, and "Warn before saving more than     MBs"
option 120. "Auto-start with browser" option 117 sets WebFormatter to
be invoked automatically when a Web browser is activated. If this
option is not selected (which is the default), WebFormatter is opened
by double clicking on a WebFormatter icon in the windowing
environment, selecting WebFormatter from the Windows start menu, or
dragging and dropping a URL from the Web browser into the WebFormatter
icon, as described in more detail above. "Open in minimized view"
option 118, when selected, opens WebFormatter in minimized mode. The
default, however, is the fully-functional mode. "Warn before printing
more than     pages" option 119 and "Warn before saving more than
    MBs" option 120 allow a user to control the number of pages saved
of a formatted document and the amount of memory space used by those
pages, respectively. The default for both of these options is for no
warning to be given. As is evident, more than one of the general
options can be selected at the same time.

Preferences dialog box 112 also includes cancel button 121, which cancels
a user's selected preferences, and OK button 122, which confirms a user's
selected preferences.

As explained above, WebFormatter can be configured to enter directly into
the minimized mode via Preferences dialog box 112, or a user can enter
the minimized mode via minimizing icon 102 shown in Figure 90. As also
noted above, Figure 9E shows an example of graphical user interface 116
for WebFormatter in the minimized mode. Graphical user interface 116 is
displayed as a floating interface while a user is exploring the Web.
Thus, as a user views a Web page, the user also views graphical user
interface 116. By clicking on an appropriate icon on graphical user
interface 116 (which, in Figure 9E, includes icons identical in both
structure and function to those shown in graphical user interface 43),
the user can capture the current Web page, process and format the Web
page into an RTF file, and save, edit and/or print the RTF file.
Alternatively, the user can drag a URL from the Web browser and drop it
into one of the icons.

A user can reconfigure WebFormatter in the minimizing mode by double
clicking a right mouse button. This action causes a preferences dialog
box to appear on display 2 which is identical to preferences dialog box
112. Thereafter, the user can alter the configuration of WebFormatter as
desired. Should a user wish to enter the fully-functional mode from the
minimizing mode, the user need merely click on maximizing icon 117 shown
in Figure 98.

Figure 10 is a flow diagram describing the operation of WebFormatter.
WebFormatter is activated in step S1000. As described above, this can be
done by double-clicking on a WebFormatter icon in a windowing
environment. Depending upon how WebFormatter has been configured, i.e.,
in the fully-functional mode or the minimizing mode, either a graphical
user interface similar to that of graphical user interface 43 or one
similar to that of graphical user interface 116 is displayed in step
S1000. For the sake of completeness, the following assumes that a
graphical user interface similar to that of graphical user interface 43
is displayed in step S1000, since the default mode of WebFormatter is the
fully-functional mode.

Next, in step S1001, WebFormatter is configured, as described above, via
preferences dialog box 112 and options dialog box 62. This step is not
necessary unless a user wishes to change WebFormatter's previously set
configuration. In step S1002, document format data is input in fields 44
and 46 to 49 described above. More specifically, the user inputs a URL
(or filename in alternative embodiments) into URL field 44. As described
below, WebFormatter uses this information to process Web pages stored at
the URL to create an RTF file based on the configuration of WebFormatter
and the data input in fields 46 to 49.

In step S1003, a Web reader similar to that of Web reader 34 described
above is executed. The Web reader connects to a network, such as the
World Wide Web, in step S1004.

Next, in step S1005, it is determined whether a URL or a filename has
been entered. As described above, in preferred embodiments of
WebFormatter, only a URL may be entered. However, since alternative
embodiments of WebFormatter may permit entry of a filename, a description
of processing a file other than one at a URL address will be provided.

If a URL has been entered in field 44, processing proceeds to step S1006.
In step S1006, the Web reader accesses the hypermedia document (e.g., a
homepage) specified by the URL address. In step S1007, WebFormatter
instructs the Web reader to traverse the hypermedia document. Thereafter,
WebFormatter selects URL address(es) from the Web and stores the
addresses in container 76. Once all desired addresses have been selected
and a processing function, such as print, has been activated,
WebFormatter downloads data stored at the addresses in container 76 into
memory 5. WebFormatter then extracts predetermined data from the
downloaded data based on the configuration information set in Options
dialog box 62, and stores the extracted data in memory 5. Thus, for
example, if "Text Only" option 72 in Options Window 62 is on, only text
is extracted from the downloaded data. Processing then proceeds to step
S1011.

On the other hand, if, in step S1005, a filename for an HTML source file
is entered, WebFormatter instructs the Web reader to access a first site
in the file. In steps S1008 and S1009, the site is traversed and data is
extracted and stored in the same manner as in step S1007, described
above. Then, in step S1010, WebFormatter determines if more sites are
listed in the HTML source file. If more sites are listed in the file,
flow returns to step S1008, and the next site is accessed. If no more
sites are present, processing proceeds to step S1011.

In step S1011, WebFormatter processes the extracted data in accordance
with the previously set format information. For example, if Columns field
48 is set to multiple, the extracted data will be formatted into a
document having multiple columns. The above processing is initiated by
activating one of Print icon 96, Edit icon 97 or Save icon 99, and is
similar to the processing described above in the first embodiment, e.g.,
flattening the document and formatting the document based on the
formatting information. Accordingly, a detailed description thereof is
omitted for the sake of brevity.

Once the documents whose URLs are stored in the container have been
downloaded, formatted according to the preset formats and configurations,
and converted into RTF file(s) in step S1011, in step S1012, the RTF
file(s) are output. Alternatively, the RTF file(s) can be edited or
saved, depending upon which icon on the graphical user interface has been
activated.
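
The branch taken at step S1005 and the loop over steps S1008 to S1010 can be summarized in a short sketch. It is written in Python purely for illustration and is not the WebFormatter implementation; the function name `run_webformatter` and its callable parameters (`sites_in_file`, `traverse_and_extract`, `format_document`) are hypothetical stand-ins for the Web reader and formatter described above.

```python
from typing import Callable, Iterable, List


def run_webformatter(
    entry: str,
    is_url: bool,
    sites_in_file: Callable[[str], Iterable[str]],
    traverse_and_extract: Callable[[str], List[str]],
    format_document: Callable[[List[str]], str],
) -> str:
    """Mirror steps S1005 to S1012: gather extracted data, then format it."""
    extracted: List[str] = []
    if is_url:
        # S1006-S1007: access the document at the URL, traverse and extract.
        extracted.extend(traverse_and_extract(entry))
    else:
        # S1008-S1010: visit each site listed in the HTML source file in turn.
        for site in sites_in_file(entry):
            extracted.extend(traverse_and_extract(site))
    # S1011: format the extracted data; the caller outputs it (S1012).
    return format_document(extracted)
```

Passing the reader and formatter in as callables keeps the sketch independent of any particular browser or output format.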

The invention has been described with respect to particular illustrative
embodiments. It is to be understood that the invention is not limited to
the above described embodiments and modifications thereto, and that
various changes and modifications may be made by those of ordinary skill
in the art without departing from the spirit and scope of the appended
claims.

APPENDIX 1 




SAMPLE USER PROFILE

The User Profile is implemented in Windows .ini file format.
[Defaults]
Count=4
Title=My Daily Paper
[1]
Heading=News In Brief
Site=1
Section=Front Page
MaxLevels=1
MaxPages=10
MaxKBytes=2000
Date=today
Print=level 0
Template=1
[2]
Heading=Sports In Brief
Site=2
Section=Sports
MaxLevels=0
MaxPages=10
MaxKBytes=200
KeywordFilter="Football" AND "49ers"
Date=today
Print=level 0
Template=1
[3]
Heading=Money Matters
Site=1
Section=Business
MaxLevels=1
MaxPages=100
MaxKBytes=20000
KeywordFilter="Computer" OR "hardware" OR "Software"
Date=today
Print=all
Template=2
[4]
Heading=Sri Lanka
Site=3
Section=HotNews
MaxLevels=1
MaxPages=100
MaxKBytes=20000
Date=today
Print=leaves
Template=
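
Because the profile uses the .ini format, it can be read with any standard .ini parser. The following is a minimal sketch in Python using the standard `configparser` module; it is an illustration only, not the Profile Editor of Appendix 3. The function name `load_profile` is hypothetical, and `SAMPLE` is an abbreviated, cleaned-up copy of the sample profile above.

```python
import configparser

# Abbreviated copy of the sample user profile (two entries only).
SAMPLE = """
[Defaults]
Count=2
Title=My Daily Paper
[1]
Heading=News In Brief
Site=1
Section=Front Page
MaxPages=10
Print=level 0
[2]
Heading=Sports In Brief
Site=2
Section=Sports
MaxPages=10
KeywordFilter="Football" AND "49ers"
Print=level 0
"""


def load_profile(text):
    """Return (title, entries) from profile text in .ini format."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written in the profile
    parser.read_string(text)
    # [Defaults] gives the number of numbered entry sections that follow.
    count = parser.getint("Defaults", "Count")
    entries = [dict(parser[str(i)]) for i in range(1, count + 1)]
    return parser["Defaults"]["Title"], entries
```

Each entry comes back as a plain dictionary of strings, so numeric limits such as MaxPages still need converting with `int()` before use.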
SAMPLE SITE PROFILES
#Legend:
#%W day of the week
#%S section part of URL
[Defaults]
Count=3
[1]
Title=San Jose Mercury News
Username=rowickram
Password=cannon
StartData=StartHeadlines
EndData=EndHeadlines
Home Page=http://www.sjmercury.com/
SectionURL=http://www.sjmercury.com/%S.htm
SectionCount=9
Section 1=Front Page
Section 2=International
Section 3=National
Section 4=Local & State
Section 5=Editorials & Commentary
Section 6=Business
Section 7=Sports
Section 8=Living
Section 9=Entertainment
[1.Sections]
Front Page=front
International=intl
National=natl
Local & State=loc
Editorials & Commentary=edit
Business=biz
Sports=spts
Living=liv
Entertainment=ent

[2]
Title=The San Francisco Chronicle
Home Page=http://www.sfgate.com/chronicle/
SectionURL="http:www.sfgate.com/cig-bin/chronicle/articlelist.cgi?%S:/chronicle/today"
SectionCount=5
Section 1=News
Section 2=Business
Section 3=Sports
Section 4=Editorial
Section 5=Datebook
[2.Sections]
News=News:MN
Business=Business:BU
Sports=sports:SP
Editorial=Editorial:ED
Datebook=Datebook:DD
[3]
Title=The Day News
Home page=http://www.lanka.net/lakehouse/anclWeb/dailynew/
SectionURL="http://www.lanka.net/lakehouse/anclWeb/dailynew/%W/%S.html"
SectionCount=12
Section 1=Business
Section 2=Editorial
Section 3=Features
Section 4=Foreign
Section 5=Letters
Section 6=InBrief
Section 7=HotNews
Section 8=Probes
Section 9=Military
Section 10=Politics
Section 11=Obituaries
Section 12=Sports
[3.Sections]
Business=business/intro
Editorial=editorial/final
Features=features/intro
Foreign=foreign/intro
Letters=letters/final
InBrief=inbrief/intro
HotNews=hotnews/intro
Probes=proves/intro
Military=military/intro
Politics=politics/intro
Obituaries=obiturai/intro
Sports=sports/intro
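
The legend describes two placeholders in the SectionURL entries: one for the section part of the URL and one for the day of the week. A minimal Python sketch of how such a template might be expanded follows. It assumes the placeholders are spelled `%S` and `%W`, that the section part comes from the matching [n.Sections] table, and that the caller supplies the day-of-week string in whatever form the site expects; the function name `section_url` and the host `host.example` are hypothetical.

```python
def section_url(template: str, section_codes: dict, section: str, weekday: str = "") -> str:
    """Expand a SectionURL template for one named section.

    template      -- SectionURL value containing %S and, optionally, %W
    section_codes -- mapping from section name to its URL part
    section       -- section name as listed in the profile
    weekday       -- day-of-week string; its exact form is site-specific
    """
    code = section_codes[section]  # raises KeyError for an unknown section
    return template.replace("%S", code).replace("%W", weekday)


# Example with the first sample site profile:
codes = {"Front Page": "front", "Sports": "spts"}
url = section_url("http://www.sjmercury.com/%S.htm", codes, "Sports")
```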

APPENDIX 2 

SYNTAX FOR RETRIEVAL, EXTRACTION AND PRINTING CRITERIA
Maximum levels to search: MaxLevels=<#>
    -1: to retrieve all levels
    0-n: to retrieve up to n levels
Maximum pages of the document: MaxPages=<#>
    n: final document not more than n pages
Maximum size of the document: MaxKBytes=<#>
    n: document size not more than n kilobytes
Exclusion rules:
    Date=today | lessthan <#>
        today: retrieve only articles posted today
        lessthan <#>;n: retrieve only articles no more than n days old
    Retrieve=all | nosubdir | nothisdir | thissiteonly
        all: allow to fetch pages from other sites
        nosubdir: exclude URLs to subdirectories
        nothisdir: exclude URLs in this directory
        thissiteonly: fetch pages from this site only
Keyword search:
    KeywordFilter=<keyword> (AND | OR | NOT) <keyword>:
        accumulate only pages containing the combination of keywords
    KeywordRank=<#>;n: use fuzzy logic to rank pages according to keyword
        combination in KeywordFilter and keep top n ranked pages
    KeywordAuthor=<author>: accumulate only pages authored by author
    ExcludeType=ads | nonEnglish
        ads: exclude advertisements
        nonEnglish: exclude articles that are not in English

Flattening rules: Print=all | leaves | level=<#>
    all: include all nodes in the tree in the linear document
    leaves: include all leaves in the tree in the linear document
    level=<#>;n: include up to nth level of the tree in the linear document
Formatting rules: Template=<#>
    n: print according to default or user template number n
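
As an illustration of the keyword search syntax, the following Python sketch evaluates a KeywordFilter expression against the text of a page. Several details are assumptions not stated in this appendix: keywords are quoted as in the sample user profile, matching is a case-insensitive substring test, operators are applied left to right without precedence, and `a NOT b` is read as "a and not b". The function name `matches_filter` is hypothetical.

```python
import re

# A token is either a quoted keyword or one of the three operators.
TOKEN = re.compile(r'"([^"]*)"|\b(AND|OR|NOT)\b')


def matches_filter(expr: str, page_text: str) -> bool:
    """Return True if page_text satisfies the KeywordFilter expression."""
    text = page_text.lower()
    result = None  # running truth value, None until the first keyword
    op = None      # most recently seen operator
    for m in TOKEN.finditer(expr):
        keyword, operator = m.group(1), m.group(2)
        if operator:
            op = operator
            continue
        hit = keyword.lower() in text
        if result is None:
            result = hit
        elif op == "AND":
            result = result and hit
        elif op == "OR":
            result = result or hit
        elif op == "NOT":
            result = result and not hit
    return bool(result)
```

With the sample profile, `matches_filter('"Football" AND "49ers"', text)` keeps only pages that mention both terms.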
APPENDIX 3
DESCRIPTION OF MODULES
Appendix 3A

THE PERSONAL NEWS PROFILE EDITOR MODULE

The Profile Editor manages access to the user profiles and is represented
by CProfileMgr class. It also manages loading and saving of the profiles.
The services provided by Profile Editor are:

BOOL NewProfile(CString fileName);
    Creates a new profile given the file name.

BOOL OpenProfile();
    Opens the default profile.

BOOL OpenProfile(CString fileName);
    Opens the named profile.

CProfileEntry* GetFirstEntry();
    Loads and returns the first profile entry.

CProfileEntry* GetNextEntry();
    Loads and returns the next profile entry.

BOOL WriteEntry(CProfileEntry& entry);
    Saves a new entry in the profile.

Each profile entry contains an extraction specification and an output
specification as represented by CProfileEntry class. The methods provided
are:

CURL GetSiteId();
    Returns the site id contained in the profile entry.

CExtractionSpec GetExtractionSpec();
    Returns the extraction specification contained in the profile entry.
    Extraction specification contains keywords for searching, limits for
    levels, pages, size in kilobytes.

COutputSpec GetOutputSpec();
    Returns the output specification contained in the profile entry.
    Output specification contains formatting instructions and tree
    traversal rules.
THE Web READER MODULE

CWebPage class abstracts the interface to the Internet browser and is
representative of the actual Web page. It will be responsible for
fetching a Web page, extracting links or references to other URLs in the
Web page, and maintaining the contents of a Web page. The methods
provided are:

BOOL Load();
    Fetch the Web page using the URL, username and password.

BOOL Parse();
    Parses the data in the Web page and creates a list of links. Also
    resolves the relative URLs into absolute URLs.

CURLList* GetLinks();
    Returns the list of links in the Web page.

CPageData* GetData();
    Returns the actual text data contained in the Web page.

void FilterContent();
    Extracts title and other information according to the site data.

CString GetTitle();
    Returns title and other information according to the site data.

CString GetAuthor();
    Returns the author of the Web page.

int GetSize();
    Returns the size of the data in kilobytes.
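
The link extraction and URL resolution attributed to Parse() can be sketched as follows. This is an illustrative Python sketch using the standard `html.parser` and `urllib.parse` modules, not the CWebPage implementation; the names `LinkCollector` and `parse_links` are hypothetical, and only `<a href>` links are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every anchor tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve a relative reference against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def parse_links(base_url, html):
    """Return the absolute URLs linked from the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links
```

Resolving links at parse time means the tree-building step only ever handles absolute URLs.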

CNetwork class will encapsulate OLE functionality and provides
communication with the Internet browser.

CString GetUsername();
    Determine the currently set username.

void SetUsername(LPCTSTR);
    Set the current username in the CNetwork object.

CString GetPassword();
    Determine the currently set password.

void SetPassword(LPCTSTR);
    Set the current password in the CNetwork object.

void Close();
    Disconnect any active connection and reset the CNetwork object.

short Read(BSTR* pBuffer, short iAmount);
    Read data retrieved by the Browser.

long GetStatus();
    Query the status of the current load.

BOOL Open(LPCTSTR pURL, short iMethod, LPCTSTR pPostData, long
lPostDataSize, LPCTSTR pPostHeaders);
    Initiates the retrieval of a URL from the network.

CString GetErrorMessage();
    Provide the caller with internally generated error messages.

short GetServerStatus();
    Determine the error status reported by the server.

long GetContentType();
    Return the content length (total amount of bytes) of the current load.

CString GetContentEncoding();
    Return the MIME encoding of the current load.

CString GetExpires();
    Return when the data retrieved by this load is no longer considered
    valid.

CString Resolve(LPCTSTR pBase, LPCTSTR pRelative);
    Generate an absolute (fully qualified) URL.

BOOL IsFinished();
    Determine if a load is complete.

short BytesReady();
    Inform the caller of the number of bytes prepared to be read.
THE SITE DRIVER MODULE

The Site Driver will provide the site information to the Web Reader. The
Site Driver is functionally similar to the Profile Editor and is
represented by CSiteDriver class. Services provided are:

BOOL NewProfile(CString fileName);
    Creates a new profile given the file name.

BOOL OpenProfile();
    Opens the default profile.

BOOL OpenProfile(CString fileName);
    Opens the named profile.

CSiteProfile* GetFirstSite();
    Loads and returns the first site entry.

CSiteProfile* GetNextSite();
    Loads and returns the next site entry.

BOOL WriteEntry(CSiteProfile& entry);
    Saves a new entry in the profile.

int NumberOfSites();
    Returns the number of sites specified in the profile.

An entry in the site profile will contain information about the base URL
of the site, title of the news source, information about how to access
the site, and various other information such as section data etc. and
will be represented by CSiteEntry class. Methods provided are:

CString GetURL();
    Returns the base URL of the site.

CString GetUsername();
    Returns the username for the site.

CString GetPassword();
    Returns the password for the site.

CString GetTitle();
    Returns the title of the news source.

int SectionCount();
    Returns the number of sections.



Appendix 3B 

TREE MANAGER MODULE

Tree Manager will maintain the most central data structure in this
program, which is a tree of Web page nodes and is represented by the
CPageTree. CPageTree will traverse the WWW to retrieve the necessary Web
pages according to the extraction specification and builds the tree. The
methods provided are:

CPageTreeNode* GetRoot();
    Returns the root node of the tree.

BOOL Build(CURL URL, CExtractionSpec& spec);
    Builds the tree according to the personal news profile extraction
    specification.
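
A tree build limited by a maximum number of levels might look like the sketch below. It is written in Python for illustration and makes several assumptions: the root page is level 0, a limit of -1 means "all levels" (following the MaxLevels syntax of Appendix 2), pages are visited breadth-first, and a page already in the tree is not added again. The names `PageNode`, `build_tree` and the `get_links` callable (a stand-in for loading a page and returning its links) are hypothetical.

```python
from collections import deque


class PageNode:
    """One Web page in the tree, with its depth below the root."""

    def __init__(self, url, level):
        self.url = url
        self.level = level
        self.children = []


def build_tree(root_url, get_links, max_levels=-1):
    """Build a page tree breadth-first; max_levels=-1 means no depth limit."""
    root = PageNode(root_url, 0)
    seen = {root_url}          # URLs already placed in the tree
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if max_levels != -1 and node.level >= max_levels:
            continue           # do not follow links below the level limit
        for link in get_links(node.url):
            if link in seen:
                continue       # avoid cycles and duplicate pages
            seen.add(link)
            child = PageNode(link, node.level + 1)
            node.children.append(child)
            queue.append(child)
    return root
```

Tracking visited URLs is what turns the cyclic link graph of the Web into a tree.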

Each node in the page tree is represented by a CPageTreeNode. Methods
provided are:

BOOL AddChild(CWebPage* page);
    Adds a child node with Web page data.

CWebPage* GetPage();
    Returns the Web page contained in the node.

int NumberOfChildren();
    Returns the number of children belonging to the node.

BOOL IsLeaf();
    Returns TRUE if a leaf node, i.e., no children.

To traverse the Web page tree, a CTreeIterator class is defined with
different traversal methods. Methods provided are:

void Reset();
    Cancels the current traversal, and initializes state data.

CPageTreeNode* GetNextNode();
    Returns the next node in the tree in a depth first search.

CPageTreeNode* GetNextSibling();
    Returns the next node in the tree in a breadth first search.

CPageTreeNode* GetNextLeaf();
    Returns the next leaf in the tree in a depth first search.
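
The depth-first traversal, combined with the flattening rules of Appendix 2 (all, leaves, level), can be sketched in a few lines. This Python sketch is illustrative only and is not the CTreeIterator class; the names `Node`, `depth_first` and `flatten` are hypothetical, and the root is assumed to be level 0.

```python
class Node:
    """A tree node holding a page title and its child nodes."""

    def __init__(self, title, children=None):
        self.title = title
        self.children = children or []

    def is_leaf(self):
        return not self.children


def depth_first(node, level=0):
    """Yield (node, level) pairs, each parent before its children."""
    yield node, level
    for child in node.children:
        yield from depth_first(child, level + 1)


def flatten(root, rule="all", max_level=0):
    """Return page titles in linear order under one flattening rule."""
    if rule == "all":
        keep = lambda n, lvl: True
    elif rule == "leaves":
        keep = lambda n, lvl: n.is_leaf()
    elif rule == "level":
        keep = lambda n, lvl: lvl <= max_level
    else:
        raise ValueError("unknown flattening rule: " + rule)
    return [n.title for n, lvl in depth_first(root) if keep(n, lvl)]
```

For a home page linking to a news page (with two stories) and a sports page, the "leaves" rule yields the two stories followed by the sports page.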
THE FORMATTER MODULE

Input to this module will be the Web page tree created by the Tree
Manager and the output specification contained in the user profile.
Formatter will traverse the tree according to the rules specified in the
output specification and the final document will be formatted using the
formatting instructions in the output specification and the formatting
contained in the Web pages such as headings, paragraphs and lists etc.
The output document will be in Rich Text Format (RTF) and will be
accessible by many applications. RTF is an advanced formatting language
for text, providing document, section and paragraph formatting, style
sheets, headers and footers, and with support for Unicode. Image formats
supported are DIB, DDB, WMF, OS/2 metafiles. There is no support for Web
images which are of the GIF format. A third party library will need to be
purchased in order to do the conversion of the GIF to DIB format or one
can be developed in-house.

The prototype creates a HTML file as the output. 

The formatter is represented by the CFormatter class. The methods
provided are:

BOOL OpenHTMLFile(CString fileName);
    Opens the named HTML file for output.

void CloseHTMLFile();
    Closes and saves the HTML file.

BOOL PrintHTML(CPageTree& root, COutputSpec& format);
    Given the root and the output specification, traverses the tree and
    prints the contents in the Web pages in HTML format.

BOOL OpenRTFFile(CString fileName);
    Opens the named RTF file for output.

void CloseRTFFile();
    Closes and saves the RTF file.

BOOL PrintRTF(CPageTree& root, COutputSpec& format);
    Given the root and the output specification, traverses the tree and
    prints the contents in the Web pages in RTF format.

BOOL Print(CPageTree& root, COutputSpec& format);
    Given the root and the output specification, traverses the tree and
    prints the contents in the Web pages to the default printer.
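
To show what writing flattened text as RTF involves, the following Python sketch emits a minimal RTF document with one paragraph per input string. It is not the CFormatter implementation: the function names `rtf_escape` and `to_rtf`, the font choice and the 12-point size are assumptions for illustration. It escapes the three characters that are special in RTF (backslash and both braces) and writes non-ASCII characters as `\uN?` escapes.

```python
def rtf_escape(text: str) -> str:
    """Escape text for inclusion in an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)          # RTF control characters
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit value per UTF-16 code unit;
            # "?" is the fallback shown by readers without Unicode support.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                unit = int.from_bytes(units[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def to_rtf(paragraphs):
    """Return a minimal RTF document with one paragraph per string."""
    body = "".join(rtf_escape(p) + "\\par\n" for p in paragraphs)
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\\f0\\fs24 "
    return header + body + "}"
```

Calling `to_rtf(["Headline", "Body text"])` returns a string that can be saved with an .rtf extension.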

4. BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a perspective view showing the outward appearance of the
personal news retrieval system according to the invention.

Figure 2 is a block diagram of the personal news retrieval system shown
in Figure 1.

Figure 3, comprised of Figures 3A, 3B, 3C and 3D, shows representational
diagrams illustrating an example of the transformation of information
from the Web (Figure 3A) to an extracted data tree (Figure 3B), then to a
flattened document (Figure 3C), and finally to a formatted document
(Figure 3D) according to the invention.

Figure 4 is a representational block diagram of the manner by which a
personal-news-profile for retrieving news articles via the Web is created
or edited according to the invention.

Figure 5, comprised of Figures 5A and 5B, shows flow diagrams describing
how a personal-news-profile is created or edited.

Figure 6 is a representational block diagram of the manner by which news
articles are retrieved from the Web and formatted with reference to a
personal-news-profile according to the invention.

Figure 7 is a flow diagram describing how news articles are retrieved
from the Web with reference to a personal-news-profile.

Figure 8 is a flow diagram showing how retrieved news articles are
formatted with reference to a personal news profile and sent to a print
device interface.

Figures 9A to 9E depict a graphical user interface used with the second
embodiment of the present invention.

Figure 10 is a flow diagram describing the operation of the second
embodiment of the invention.

1 ABSTRACT

A World Wide Web site data retrieval system includes an input device for
inputting data and commands to access the World Wide Web, and a memory
for storing a Web site data retrieval driver which includes a Web reader,
stored Web site address information, stored Web site commands, and stored
format information. The memory also stores process steps to connect to a
Web site and to issue commands within the connected Web site, and a
connection to the World Wide Web. The system includes a processor for
launching the Web site data retrieval driver in response to a command to
access the World Wide Web. The Web site retrieval driver, upon being
launched, (1) launches the Web reader to connect to the World Wide Web
via the connection, (2) retrieves the Web site address information and
Web site commands, (3) instructs the Web reader to access the Web site
based on the Web site address information and Web site commands, (4)
downloads Web site data from the Web site based on the Web site commands,
(5) stores the Web site data in a linear document, (6) repeats steps 1
through 5 until all addresses in the stored Web site address information
have been accessed, and (7) formats the linear document into a
personalized document based on the format information.

2 REPRESENTATIVE DRAWING 

Fig. 3A



